INTRODUCTION AND SUMMARY.

In this work we consider the model

(1)  $y_t = \alpha_0 \epsilon_t + \alpha_1 \epsilon_{t-1} + \cdots + \alpha_q \epsilon_{t-q}$ ,

where $\alpha_q \neq 0$ and we often assume that $\alpha_0 = 1$. The $\epsilon_t$'s are
independent normal random variables with zero expected values and constant
common variances. The $\epsilon_t$'s are unobservable, the $y_t$'s are observable,
and the $\alpha_j$'s are constants (parameters). For purposes of theoretical
analysis we take t to range in the set of integers, so that (1)
defines a stationary stochastic process, while for purposes of statistical
inference we consider a finite set of equally spaced sample values, for
$t = 1, 2, \ldots, T$; in either case we call (1) the moving average model.

We call q (q > 0) the order of the moving average, and in many cases
the statistical arguments require that the $\alpha_j$'s be such that the roots
of the associated polynomial equation
$\alpha_0 z^q + \alpha_1 z^{q-1} + \cdots + \alpha_q = 0$ be less than one in
absolute value.


The importance of the moving average model for time series analysis,
in which case t is interpreted as time, stems from several facts. Among
them we note the following:

(a) In a variety of fields of application, the formulation of
reasonable statistical models leads to moving average schemes, or more
complicated versions of them. For several examples see Nicholls, Pagan
and Terrell (1973). One may ascribe part of the potentiality of the
moving average model in these situations to its structure, which postulates
linear combinations of current and past error terms to explain the random
part of the data.
(b) The autocovariance sequence has zero values for lag lengths
exceeding q. This may be a reasonable hypothesis on which to model
empirical phenomena.

(c)  The  spectral  density  function  is  a real-valued  trigonometric 
polynomial.  As  such  it  can  approximate  the  spectral  density  function 
of  a wide  class  of  stochastic  processes  or  time  series. 

(d) Due to the relation between moving average and autoregressive
models, which we consider in some detail in Chapter 1, the moving average
model may on some occasions provide a competing framework with similar
properties to that of the autoregressive model and fewer parameters to
be studied statistically. This is important because the linear dependence
of a time series on its own past values provides another empirically
attractive model.

(e) The moving average model is a simple case of a mixed model
(autoregressive with moving average residuals). Mixed models are very flexible
tools to study time series empirically, and provide a general approximation
to many stochastic processes, since they have rational spectral densities.
However, their statistical analysis has proved very hard, due mainly to the
presence of the moving average part.

For these reasons and others, recent years have witnessed a growth of
proposals to estimate the parameters of (1). Several of these will be
reviewed in Section 1.4, after some notation is developed. It will then
be pointed out that there are mathematical difficulties in maximum
likelihood and least squares estimation, that efficient algorithms need to be
developed if one is to follow one of these approaches, and that some
results are already available in the area.
On the other hand some "analog" or intuitive estimators were shown
to be highly inefficient. The search for asymptotically efficient
estimators led to consideration of procedures that operate in two stages.
The mathematical theory for these is also complicated, and most of our
efforts are devoted to providing proofs for two existing proposals of this
type. Besides filling in a gap in the literature, we try to gain insight
into the estimation problem from this basis.

In Chapter 1 we define the model, derive some of its probabilistic
properties, and deduce two representations related to the autoregressive
model and several alternative parametrizations. The last part of the
chapter contains a brief review of some existing estimation procedures.

In Chapter 2 we consider the possibility of using k sample
autocovariances (k > q) to estimate the parameters of (1). Walker (1961)
studied the statistical properties of a proposal of his when k is treated
as fixed and $T \to \infty$. His conclusions and examples show that the method
is endowed with good statistical properties. Under his approach the
asymptotic distribution of the estimators depends on k; by studying
the effect of k on the parameters of the distribution, one is guided
in the selection of a particular value of k in a practical estimation
situation.

A different approach to the theory is to let $k \to \infty$ as well as
$T \to \infty$, and then find the conditions that give consistency, asymptotic
normality and efficiency. This is done in Chapter 2 for the case of
q = 1. It is shown (Theorem 2.3) that if k = k(T) dominates log T
and is dominated by $T^{1/2}$, then the estimator proposed by Walker is
consistent and asymptotically efficient. (That is, it achieves the
asymptotic variance of the maximum likelihood estimator.) In fact the
consistency is obtained with no condition on k(T) other than that it tends
to infinity with T (Theorem 2.1).

The approach in proving these theorems involves obtaining an explicit
form for the components of the inverse of a symmetric matrix with equal
elements along its five central diagonals, and zeroes elsewhere. The
derivation of these results, and related material, appears in Mentz (1972).
There exists wide interest in solving the mathematical problem of finding
these explicit inverses. The technique that gives the most useful results
in our case is to pose difference equations for the components of the
inverse, and solve them explicitly.

The main technique used to prove the asymptotic normality of the
estimator is a central limit theorem for normalized sums of random
variables that are dependent of order k, where k tends to infinity
with T.

As a consequence of the study in Chapter 2, an alternative form of
the estimator is presented in Chapter 3, which facilitates the
calculations and the analysis of the practical role of k, without changing the
asymptotic properties.

In Chapter 4 we consider a different approach due to Durbin (1959),
based on approximating the moving average of order q by an autoregression
of order k (k > q). This is also an appealing estimation proposal,
because the necessary computations involve the solution of standard systems
of linear equations, and the method shows good statistical properties.

The paper by Durbin does not treat in detail the role of k in the
parameters of the limiting normal distributions, so that Chapter 4 is
devoted to this topic for the case of q = 1, when k is treated as
fixed and $T \to \infty$. We derive the probability limit (Theorem 4.1) and
the variance of the limiting normal distribution of the estimator
(Theorem 4.2), and compare them with the desired values: the parameter
in (1) and the asymptotic variance of the maximum likelihood estimator.
The differences turn out to be exponentially decreasing functions of k,
confirming some of the examples presented by Durbin.

The parallel analysis with k = k(T) was also attempted, but at
this point no complete proofs are available. Instead we present the
limit as $k \to \infty$ of the parameters of the limiting distributions as
$T \to \infty$ (Theorems 4.8 and 4.9). In the case of the parameter of interest
these limits coincide with the desired values mentioned above.

Finally a modification of Durbin's proposal by Anderson (1971b) is
studied in detail in Chapter 5, also for the case of q = 1. The
modification simplifies the first stage of the procedure by using some of the
conditions derived from the underlying moving average model.
1. THE MOVING AVERAGE MODEL


1.1  Introduction. 


We consider the time-series model

(1.1)  $y_t = \sum_{j=0}^{q} \alpha_j \epsilon_{t-j}$ ,

where

(1.2)  $\alpha_0 = 1$ ,

(1.3)  $\alpha_q \neq 0$ ;

the sequence $\{\epsilon_t\}$ is composed of independent normal random variables,
and for all choices of t

(1.4)  $\mathcal{E} \epsilon_t = 0$ ,

and

(1.5)  $\mathcal{E} \epsilon_t^2 = \sigma^2$ ,

where $0 < \sigma^2 < \infty$. Further the associated polynomial equation

(1.6)  $\sum_{j=0}^{q} \alpha_j z^{q-j} = 0$

has all its roots less than one in absolute value.

If we think of t as ranging in the set of integers $\{\ldots, -1, 0, 1, \ldots\}$,
then (1.1) defines a wide-sense stationary stochastic process,
even if the $\epsilon_t$'s are not identically distributed. The process becomes
strictly stationary when we assume that the $\epsilon_t$'s are identically
distributed. We call (1.1) a moving average of order q.
We note that when q = 1, (1.1) reduces to the simple form

(1.7)  $y_t = \epsilon_t + \alpha \epsilon_{t-1}$ ,

and the conditions (1.2) and (1.3) together with the condition on the
roots of (1.6) reduce to $0 < |\alpha| < 1$. We shall pay much attention to
(1.7) since the mathematical manipulations simplify considerably in
this case.
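The root condition attached to (1.6) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names `ma_roots` and `satisfies_root_condition` are hypothetical and are introduced only for illustration.

```python
import numpy as np

def ma_roots(alpha):
    """Roots of alpha_0 z^q + alpha_1 z^(q-1) + ... + alpha_q = 0.

    `alpha` lists (alpha_0, ..., alpha_q); numpy.roots expects the
    coefficient of the highest power first, which is this ordering.
    """
    return np.roots(np.asarray(alpha, dtype=float))

def satisfies_root_condition(alpha):
    """True when every root is less than one in absolute value."""
    return bool(np.all(np.abs(ma_roots(alpha)) < 1.0))
```

For q = 1 the polynomial is z + alpha, with the single root -alpha, so the check reduces to |alpha| < 1, in agreement with the remark after (1.7).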

From (1.1) it is easy to see that

(1.8)  $\mathcal{E} y_t = 0$ , for all t.

The autocovariances (or simply covariances) of the $y_t$'s are

(1.9)  $\mathcal{E} y_t y_{t+s} = \sigma_y(s) = \sigma^2 \sum_{j=0}^{q-|s|} \alpha_j \alpha_{j+|s|}$ ,  $|s| \le q$ ,
       $\sigma_y(s) = 0$ ,  $|s| > q$ .

As expected, since $\{y_t\}$ is wide-sense stationary, the covariances
do not depend on the time t. Equation (1.9) is written in full, for
$s \ge 0$, as

(1.10)  $\sigma_y(0) = \sigma^2 (1 + \alpha_1^2 + \cdots + \alpha_q^2)$ ,
        $\sigma_y(1) = \sigma^2 (\alpha_1 + \alpha_1 \alpha_2 + \cdots + \alpha_{q-1} \alpha_q)$ ,
        $\vdots$
        $\sigma_y(q) = \sigma^2 \alpha_q$ ,
        $\sigma_y(s) = 0$ ,  $s = q+1, q+2, \ldots$ .


The autocorrelations $\rho_y(s)$ are defined by

(1.11)  $\rho_y(s) = \sigma_y(s) / \sigma_y(0)$ ,  $|s| = 1, 2, \ldots$ .

For example, when q = 1 equations (1.10) reduce to

(1.12)  $\sigma_y(0) = \sigma^2 (1 + \alpha^2)$ ,
        $\sigma_y(1) = \sigma^2 \alpha$ ,
        $\sigma_y(s) = 0$ ,  $s = 2, 3, \ldots$ ,

and equation (1.11) gives

(1.13)  $\rho_y(1) = \alpha / (1 + \alpha^2)$ ,
        $\rho_y(s) = 0$ ,  $|s| = 2, 3, \ldots$ .


For $\alpha$ real the function $\alpha / (1 + \alpha^2)$ attains its absolute maximum when
$\alpha = 1$, and its absolute minimum when $\alpha = -1$. It then follows that for
$|\alpha| < 1$

(1.14)  $|\rho_y(1)| < 1/2$ .


For arbitrary q the autocorrelations are

(1.15)  $\rho_y(s) = \sum_{j=0}^{q-|s|} \alpha_j \alpha_{j+|s|} \Big/ \sum_{j=0}^{q} \alpha_j^2$ ,  $|s| = 0, 1, \ldots, q$ ,
        $\rho_y(s) = 0$ ,  $|s| > q$ ,

and the correlogram (graph of $\rho_y(s)$ against the time differences or
"lags") has the typical shape: it presents possibly nonzero values up
to lag q, and zero values from there onwards.
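The autocovariance and autocorrelation formulas of this section translate directly into a few lines of code. The sketch below assumes NumPy; the function names are hypothetical, and `alpha` is taken to be the full coefficient list (alpha_0, ..., alpha_q) with alpha_0 = 1.

```python
import numpy as np

def ma_autocovariances(alpha, sigma2=1.0):
    """sigma_y(s) for s = 0, ..., q: sigma^2 * sum_j alpha_j alpha_{j+s}."""
    alpha = np.asarray(alpha, dtype=float)
    q = len(alpha) - 1
    # For lag s the sum runs over j = 0, ..., q - s.
    return np.array([sigma2 * np.dot(alpha[: q + 1 - s], alpha[s:])
                     for s in range(q + 1)])

def ma_autocorrelations(alpha):
    """rho_y(s) for s = 0, ..., q; the error variance cancels."""
    g = ma_autocovariances(alpha)
    return g / g[0]
```

For lags beyond q both quantities are zero, which is the cut-off in the correlogram described above.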


1.2  Two  Exact  Representations. 

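For simplicity we illustrate the main ideas with the case q = 1.
From (1.7), by successive substitutions we obtain

(1.16)  $\epsilon_t = y_t - \alpha \epsilon_{t-1}$
        $= y_t - \alpha (y_{t-1} - \alpha \epsilon_{t-2})$
        $= y_t - \alpha y_{t-1} + \alpha^2 \epsilon_{t-2}$
        $= \cdots$
        $= y_t - \alpha y_{t-1} + \alpha^2 y_{t-2} - \cdots + (-\alpha)^k y_{t-k} + (-\alpha)^{k+1} \epsilon_{t-(k+1)}$ ,

that is,

(1.17)  $\sum_{j=0}^{k} (-\alpha)^j y_{t-j} = \epsilon^*_{t,k}$ ,

where we define

(1.18)  $\epsilon^*_{t,k} = \epsilon_t - (-\alpha)^{k+1} \epsilon_{t-(k+1)}$ .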


If we think of a finite set $y_1, y_2, \ldots, y_T$ of random variables
corresponding to model (1.7), then equations (1.17) and (1.18) above
hold for $t = k+1, \ldots, T$ and any k such that $1 \le k \le T-1$. If we
think of t as ranging in the whole set of integers, then the equations
hold for all t, and k any natural number.
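The finite representation can be verified on simulated data: the weighted sum of current and past observations with weights (-alpha)^j must equal the current error minus (-alpha)^(k+1) times the error k+1 periods back, exactly and not just on average. The sketch below assumes NumPy; the parameter values and the array layout (an extra starting error `eps[0]`) are illustrative choices, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, T, k = 0.6, 50, 5

eps = rng.standard_normal(T + 1)      # eps[t] holds eps_t for t = 0, ..., T
y = eps[1:] + alpha * eps[:-1]        # y[t - 1] holds y_t for t = 1, ..., T

def eps_star(y, alpha, k):
    """sum_{j=0}^{k} (-alpha)^j y_{t-j} for t = k+1, ..., T."""
    T = len(y)
    w = (-alpha) ** np.arange(k + 1)
    j = np.arange(k + 1)
    return np.array([np.dot(w, y[t - 1 - j]) for t in range(k + 1, T + 1)])

lhs = eps_star(y, alpha, k)
t = np.arange(k + 1, T + 1)
rhs = eps[t] - (-alpha) ** (k + 1) * eps[t - (k + 1)]
print(np.max(np.abs(lhs - rhs)))      # differs from zero only by rounding
```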

It is clear that (1.17) and (1.18) constitute an alternative
representation of (1.7). Its importance lies in the fact that (1.17) has the
form of an autoregression; its problem lies in that the $\epsilon^*_{t,k}$ are not
uncorrelated, when the $\epsilon_t$ are as in (1.1).

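We determine the first- and second-order moments of the $\epsilon^*_{t,k}$.
From (1.18) and (1.4) it is clear that

(1.19)  $\mathcal{E} \epsilon^*_{t,k} = 0$

for all relevant t and k. Further

(1.20)  $\mathcal{E} \epsilon^{*2}_{t,k} = \mathcal{E} \epsilon_t^2 + \alpha^{2k+2} \mathcal{E} \epsilon^2_{t-(k+1)} = \sigma^2 (1 + \alpha^{2k+2})$ ,

that is, $\epsilon^*_{t,k}$ has a larger variance than $\epsilon_t$. The covariances are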
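(1.21)  $\mathrm{Cov}(\epsilon^*_{t,k}, \epsilon^*_{t+s,k}) = \mathcal{E} \epsilon^*_{t,k} \epsilon^*_{t+s,k}$
        $= \mathcal{E} [\epsilon_t - (-\alpha)^{k+1} \epsilon_{t-(k+1)}] [\epsilon_{t+s} - (-\alpha)^{k+1} \epsilon_{t+s-(k+1)}]$
        $= -(-\alpha)^{k+1} [\mathcal{E} \epsilon_t \epsilon_{t+s-(k+1)} + \mathcal{E} \epsilon_{t-(k+1)} \epsilon_{t+s}]$
        $= -\sigma^2 (-\alpha)^{k+1} = (-1)^k \sigma^2 \alpha^{k+1}$ ,  $|s| = k+1$ ,
        $= 0$ ,  otherwise ($s \neq 0$) .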


This result can be put in a clear visual context by introducing
some matrix notation. Let us define the vectors

(1.22)  $\epsilon = (\epsilon_1, \ldots, \epsilon_T)'$ ,  $\epsilon^*_k = (\epsilon^*_{k+1,k}, \ldots, \epsilon^*_{T,k})'$ .

Then from (1.4) and (1.5) we deduce that

(1.23)  $\mathcal{E} \epsilon \epsilon' = \sigma^2 I_T$ ,

where the prime denotes matrix transposition, and $I_T$ is the identity
matrix of order T. Similarly (1.21) can be expressed as

(1.24)  $\mathcal{E} \epsilon^*_k \epsilon^{*\prime}_k = \sigma^2 (1 + \alpha^{2k+2}) I_{T-k} - \sigma^2 (-\alpha)^{k+1} G_{k+1}$ ,

where the matrix $G_{k+1}$ is $(T-k) \times (T-k)$, and has ones along the
diagonals in places (k+1) above and below the main diagonal, and
zeroes elsewhere; if $g^{(k+1)}_{ij}$ denotes the i,j-th element of this
matrix, then

(1.25)  $g^{(k+1)}_{ij} = 1$ ,  $|i-j| = k+1$ ,
        $g^{(k+1)}_{ij} = 0$ ,  otherwise .
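The covariance structure of the errors in the finite autoregressive representation can be checked by writing them as a linear transformation of the underlying independent errors. The sketch below assumes NumPy and compares the product form with the closed form: a multiple of the identity minus a multiple of a matrix with ones on the diagonals k+1 places off the main one. The function names are hypothetical.

```python
import numpy as np

def eps_star_covariance(alpha, sigma2, T, k):
    """Covariance of (eps*_{k+1,k}, ..., eps*_{T,k}) computed as sigma^2 B B'.

    Row t of B expresses eps*_{t,k} = eps_t - (-alpha)^(k+1) eps_{t-(k+1)}
    in terms of the independent errors (eps_0, ..., eps_T).
    """
    B = np.zeros((T - k, T + 1))
    for row, t in enumerate(range(k + 1, T + 1)):
        B[row, t] = 1.0
        B[row, t - (k + 1)] = -(-alpha) ** (k + 1)
    return sigma2 * B @ B.T

def closed_form(alpha, sigma2, T, k):
    """sigma^2 (1 + alpha^(2k+2)) I - sigma^2 (-alpha)^(k+1) G_{k+1}."""
    n = T - k
    i, j = np.indices((n, n))
    G = (np.abs(i - j) == k + 1).astype(float)
    return (sigma2 * (1 + alpha ** (2 * k + 2)) * np.eye(n)
            - sigma2 * (-alpha) ** (k + 1) * G)
```

The off-diagonal band is small when k is large, which is why the representation behaves almost like an autoregression with uncorrelated errors.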


Another exact representation may be obtained by letting k tend to
infinity. We now think of $\{y_t\}$ as a stochastic process with t ranging
in the set of integers. When q = 1, from (1.17) and (1.18) we have that

(1.26)  $\mathcal{E} \left[ \sum_{j=0}^{k} (-\alpha)^j y_{t-j} - \epsilon_t \right]^2 = \alpha^{2k+2} \mathcal{E} \epsilon^2_{t-(k+1)} = \sigma^2 \alpha^{2k+2}$ ,

which converges to zero as $k \to \infty$, since $|\alpha| < 1$. Hence we write

(1.27)  $\epsilon_t = \sum_{j=0}^{\infty} (-\alpha)^j y_{t-j}$ ,

in the sense of convergence in mean square of sequences of random variables.

For general q we may proceed along the same lines. The details are
given in Appendix A.


1.3 Alternative Parametrizations.

The moving average (1.1) is parametrized by $\sigma^2$ and the coefficients
$\alpha_1, \ldots, \alpha_q$. For some purposes the first q+1 equations of (1.10) provide
an alternative useful parametrization in terms of the covariances
$\sigma_y(0), \sigma_y(1), \ldots, \sigma_y(q)$. From (1.11) it is easy to see that $\sigma_y(0)$ and the
autocorrelations $\rho_y(1), \ldots, \rho_y(q)$ are an equivalent set of parameters.

A general argument to show how to recover the $\alpha_j$'s from information
about the $\sigma_y(j)$'s is given in Anderson [(1971a), pp. 224-25]; a practical
computing routine is given in G. Wilson (1969); a discussion of the
statistical consequences of using the latter appears in Clevenson (1970).


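Some authors prefer to analyze the process (1.1) through its spectral
density, which is given by

(1.28)  $f_y(w) = \frac{\sigma^2}{2\pi} \left| \sum_{j=0}^{q} \alpha_j e^{iwj} \right|^2$ ,  $-\pi \le w \le \pi$ ,
        $= \frac{\sigma^2}{2\pi} \sum_{j=0}^{q} \sum_{j'=0}^{q} \alpha_j \alpha_{j'} e^{iw(j-j')}$
        $= \frac{\sigma^2}{2\pi} \sum_{s=-q}^{q} \sum_{j=0}^{q-|s|} \alpha_j \alpha_{j+|s|} e^{iws}$ ,  letting $s = j - j'$ ,
        $= \frac{1}{2\pi} \sum_{s=-q}^{q} \sigma_y(s) e^{iws}$ ,  using (1.9) ,
        $= \frac{\sigma_y(0)}{2\pi} \sum_{s=-q}^{q} \rho_y(s) e^{iws}$ .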
Hence $f_y(w)$ can be expressed as a function of either one of the
sets of parameters introduced above. Since the spectral density in this
case satisfies the "inversion formula"

(1.29)  $\sigma_y(h) = \int_{-\pi}^{\pi} \cos(wh) f_y(w) \, dw$ ,  $h = 0, \pm 1, \pm 2, \ldots$ ,

in principle we can also recover any of the sets of parameters once $f_y(w)$
is given. The practical problem of recovering values of parameters in some
set from information about the spectral density gives rise to an important
avenue of estimation procedures for this model. Some of these are reviewed
in Section 1.4.
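The passage from the coefficients to the spectral density and back to the autocovariances through the inversion formula can be illustrated numerically. The sketch below assumes NumPy; the function names and the grid size `n` are illustrative. Because the integrand is a trigonometric polynomial, an equally spaced sum over one period reproduces the integral up to rounding once `n` exceeds the degree.

```python
import numpy as np

def ma_spectral_density(w, alpha, sigma2=1.0):
    """f_y(w) = sigma^2 / (2 pi) * |sum_j alpha_j exp(i w j)|^2."""
    alpha = np.asarray(alpha, dtype=float)
    j = np.arange(len(alpha))
    transfer = np.exp(1j * np.outer(w, j)) @ alpha
    return sigma2 / (2 * np.pi) * np.abs(transfer) ** 2

def autocovariance_from_spectrum(h, alpha, sigma2=1.0, n=4096):
    """Integral of cos(w h) f_y(w) over [-pi, pi), by an equally spaced sum."""
    w = -np.pi + 2 * np.pi * np.arange(n) / n
    f = ma_spectral_density(w, alpha, sigma2)
    return float(np.sum(np.cos(w * h) * f) * (2 * np.pi / n))
```

With `alpha = [1, 0.5]` and `sigma2 = 2`, the recovered values at lags 0, 1 and 2 agree with those obtained directly from the coefficient formulas for q = 1.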
1.4 Some Estimation Procedures.

In this section we review briefly some of the more important
contributions to the problem of estimating the parameters of the moving average
model (1.1). Reviews of estimation procedures are contained in Hannan
(1969) and Walker (1961).

To organize our exposition we shall attempt to separate the various
proposals into categories according to the nature of the basic ideas
involved. Since most contributions use tools corresponding to several
lines of approach, the categories will in this sense be far from exclusive.

Throughout this section we consider a sample $y_1, \ldots, y_T$ from
(1.1). For the sake of simplicity many remarks are referred to the case
q = 1, or illustrated by means of it.


1.4.1  Early  Work. 

Wold's book (1954) is a good starting point for this review, since
he appears to have been the first to attempt to estimate the parameters of a
moving average process [cf. (1954), pp. 150-151]. His suggestion can be
interpreted in our notation as follows: From (1.28), letting $z = e^{iw}$,
we have that

(1.30)  $\sigma^2 \left| \sum_{j=0}^{q} \alpha_j z^j \right|^2 = \sigma^2 \sum_{j=0}^{q} \alpha_j z^j \sum_{j=0}^{q} \alpha_j z^{-j} = \sigma_y(0) \sum_{s=-q}^{q} \rho_y(s) z^s$ .


The $\rho_y(s)$'s can be estimated by

(1.31)  $r_{sT} = r_{-s,T} = c_{sT} / c_{0T}$ ,  $c_{sT} = \frac{1}{T} \sum_{t=1}^{T-s} y_t y_{t+s}$ ,  $s = 0, 1, \ldots, q$ ,


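and the estimators $\hat{\alpha}_j$ solved for in

(1.32)  $\hat{\sigma}^2 \sum_{j=0}^{q} \hat{\alpha}_j z^j \sum_{j=0}^{q} \hat{\alpha}_j z^{-j} = c_{0T} \sum_{s=-q}^{q} r_{sT} z^s$ .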


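For example, if q = 1, $\alpha_1 = \alpha$, and if we let $h = z + z^{-1}$, (1.30)
leads to

(1.33)  $\frac{\sigma^2}{\sigma_y(0)} (1 + \alpha z)(1 + \alpha z^{-1}) = \frac{1}{1+\alpha^2} \left[ 1 + \alpha (z + z^{-1}) + \alpha^2 \right]$
        $= 1 + \frac{\alpha}{1+\alpha^2} h$
        $= \rho_y(1) z^{-1} + 1 + \rho_y(1) z$
        $= 1 + \rho_y(1) h$ ,

so that the desired estimator is obtained by solving
$r_{1T} \alpha^2 - \alpha + r_{1T} = 0$; the only admissible root is

(1.34)  $\hat{\alpha} = \frac{1 - \sqrt{1 - 4 r_{1T}^2}}{2 r_{1T}}$ .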


This estimator is consistent, but asymptotically inefficient compared
with the maximum likelihood estimator [see Whittle (1953)].

The inefficiency of (1.34) as an estimator of $\alpha$ can be ascribed to
that of $r_{1T}$ as an estimator of $\rho_y(1)$. Hence it pays to try to improve
the estimation of the autocorrelations; some suggestions in this direction
are reviewed in sections 1.4.2 and 1.4.5 below.

For general q, the problem of solving (1.32) for the $\hat{\alpha}_j$'s has been
considered already in section 1.3. See also Wold [(1954), pp. 123-132, 150-174].
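For q = 1 the moment estimator described in this subsection amounts to computing the first sample autocorrelation and taking the root of the quadratic that lies inside the unit interval. The sketch below assumes NumPy; the function names are hypothetical, and the handling of a first autocorrelation of at least one half in absolute value (raising an error) is an illustrative choice rather than something prescribed by the text.

```python
import numpy as np

def sample_autocorrelation(y, s):
    """r_sT = c_sT / c_0T with c_sT = (1/T) sum_{t=1}^{T-s} y_t y_{t+s}."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    c0 = np.dot(y, y) / T
    cs = np.dot(y[: T - s], y[s:]) / T
    return cs / c0

def moment_estimate_ma1(r1):
    """Root of r1 * a^2 - a + r1 = 0 with |a| < 1."""
    if r1 == 0.0:
        return 0.0
    if abs(r1) >= 0.5:
        raise ValueError("no root with |alpha| < 1 when |r1| >= 1/2")
    return (1.0 - np.sqrt(1.0 - 4.0 * r1 ** 2)) / (2.0 * r1)
```

A population autocorrelation of 0.4 maps back to a coefficient of 0.5, since 0.5 / (1 + 0.25) = 0.4.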


1.4.2 Maximum-Likelihood Estimation.

When the $\epsilon_t$'s in (1.1) are normal, the joint distribution of the
vector $y = (y_1, \ldots, y_T)'$ generated by the moving average process is

(1.35)  $(2\pi)^{-T/2} |\Sigma|^{-1/2} \exp\left( -\tfrac{1}{2} y' \Sigma^{-1} y \right)$ ,

where $\Sigma = \mathcal{E} y y'$. Since $\Sigma$ is a function of the $\alpha_j$'s (and of $\sigma^2$), (1.35),
taken as a function of the parameters for y fixed, is the likelihood function
of the observations.

The possibility of finding the maximum likelihood estimators of the
$\alpha_j$'s was studied by Whittle (1951), (1952), (1953). There are difficulties
in finding explicit forms for the estimators, which can be attributed to the
complicated nature of the inverse matrix $\Sigma^{-1}$.

For q = 1 and using some approximations, it can be shown that the
maximum likelihood estimator approximately minimizes

(1.36)  $\frac{1}{1-\alpha^2} \left[ \sum_{t=1}^{T} y_t^2 + 2 \sum_{u=1}^{T-1} (-\alpha)^u \sum_{t=1}^{T-u} y_t y_{t+u} \right]$ ;


see e.g. Durbin (1959). The estimate can then be found by means of
some search procedure, e.g. using a computer program. For most values of
q, the search for the minimizing set of $\alpha_j$'s may be quite cumbersome,
as has been noted repeatedly in the literature.
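A search procedure of the kind mentioned above can be sketched for q = 1 as a plain grid search. The criterion coded here is an assumption stated explicitly: the sum of squares plus twice the sum over lags u of (-alpha)^u times the lagged product sums, all divided by 1 - alpha^2. The sketch assumes NumPy; the function names and the grid are illustrative.

```python
import numpy as np

def lagged_products(y):
    """S_u = sum_{t=1}^{T-u} y_t y_{t+u} for u = 0, ..., T-1."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    return np.correlate(y, y, mode="full")[T - 1:]

def approx_criterion(alpha, S):
    """[S_0 + 2 sum_{u>=1} (-alpha)^u S_u] / (1 - alpha^2)."""
    u = np.arange(1, len(S))
    return (S[0] + 2.0 * np.sum((-alpha) ** u * S[1:])) / (1.0 - alpha ** 2)

def grid_search_ma1(y, grid=None):
    """Grid value of alpha that minimizes the approximate criterion."""
    if grid is None:
        grid = np.linspace(-0.95, 0.95, 381)   # step 0.005, inside (-1, 1)
    S = lagged_products(y)
    values = np.array([approx_criterion(a, S) for a in grid])
    return float(grid[np.argmin(values)])
```

The lagged product sums are computed once and reused for every trial value, so the cost of the search is dominated by that single pass. A finer grid or a one-dimensional minimizer could replace the grid without changing the idea.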

The asymptotic theory of the maximum likelihood estimators was
explored by Whittle (1951), (1952), (1953). He gave arguments to support
his claim that, asymptotically, the same behavior as in the case of
independent sampling from a "regular" distribution will be achieved. It
may be worthwhile to review Whittle's initial contributions, since some
confusion seems to exist in the literature.

Whittle [(1953), pp. 426-427] argued towards the consistency of the
maximum likelihood estimators; he then considered the distribution of the
maximum likelihood estimators and noted that it is "... distributed in the
same fashion as if the sample material had consisted of [T] independent
variates with [a given] frequency function p(x) ..." so that "... with
the aid of this equivalence, estimator properties such as efficiency, etc.,
may be established simply by referring back to existing theorems for
independent series" (pp. 427-428). This part of his work must be regarded
as providing an informal argument; cf. Hannan [(1960), footnote on page 46].
Finally Whittle shows that the maximum likelihood estimators are the
consistent estimators with minimum asymptotic variances among those satisfying
a certain estimating equation that is basic in his work [(1953), equation
(2.8), page 428].

There has been considerable work to give formal detailed proofs of
these and other related statements. Among others see Whittle (1961),
Walker (1964), who gives a proof of consistency and asymptotic normality,
Ibragimov (1967), who treats consistency, Dzhaparidze (1970), who treats
the closely related case of a continuous time parameter, and references
therein.

One important consequence of these researches is that under suitable
regularity conditions, the maximum likelihood estimators of the $\alpha_j$'s behave
asymptotically like similar estimators for the parameters of an autoregressive
model of the same order.


Under the present heading we also include Walker's (1961) proposal,
that he regards as "... a modification of Whittle's method which enables
[some of its] difficulties to be avoided to a large extent, and also usually
requires much less computation" (page 345). He uses the maximum likelihood
approach to search for the asymptotically efficient estimators of the
autocorrelations $\rho_y(1), \ldots, \rho_y(q)$, and the sample information is used through
$r_{jT}$, $j = 1, 2, \ldots, q+k$, $k > 1$. Walker's proposal will be studied in some
detail in chapters 2 and 3. For a review of his work see also Anderson
[(1971a), Section 5.7.2]. Walker's paper also contains a review of
Whittle's contributions in this area.

The estimation of the autocovariances $\sigma_y(s)$, $s = 0, 1, \ldots, q$ by
maximum likelihood has been approached also from the point of view of the
relation between this problem and that of estimating a covariance matrix
of special structure in multivariate normal sampling. Anderson (1971b),
(1973) derived an iterative procedure which attempts to obtain efficient
estimates of the $\sigma_y(s)$'s.

Recently Box and Jenkins (1970) presented computational approaches to
find the maximum likelihood estimates, as will be mentioned below.


Closely related to the maximum likelihood approach is the least-squares
estimation procedure for this case. Least squares estimation of the $\alpha_j$'s
leads to nonlinear equations, which can be solved by special computer techniques;
see e.g. Pierce (1970). This author studied the asymptotic properties of the
least squares estimates of the parameters of a moving average, and one main
conclusion is that they are those of the least squares estimates of the
parameters in a corresponding autoregressive model of the same order;
i.e. the same kind of duality we noted for the maximum likelihood estimators.
The connection is not surprising since (1.36), the approximate expression
to be minimized for the maximum likelihood estimators, is also the least squares
criterion; see Walker (1964), or Box and Jenkins
[(1970), Chapter 7]. These latter authors analyze in detail the
computational problems associated with (1.36), and also present an analysis of
the exact likelihood function. One can say that for finite samples, the
difference between using (1.36) and the exact likelihood arises because
one approximates $\Sigma^{-1}$, and further neglects the determinant in (1.35),
which appears in going from the independent $\epsilon_t$'s to the $y_t$'s.


1.4.4  Estimation  Based  on  the  Finite  Autoregressive  Approximation. 

In section 1.2 it was shown that a moving average process admits a
representation as a finite autoregression with correlated residuals.
Durbin (1959) used these ideas to derive an estimation procedure for the
$\alpha_j$'s; his work will be considered in detail below. For a review of this
work see Anderson [(1971a), Section 5.7.2].
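A two-stage scheme of this type can be sketched for q = 1 as follows. The first stage fits an autoregression of order k by least squares. The second-stage formula used here, minus the sum of products of adjacent fitted coefficients divided by their sum of squares, is an assumption: it is one commonly quoted form for this kind of estimator and is not stated in the text above. The sketch assumes NumPy, and the function names are hypothetical.

```python
import numpy as np

def fit_autoregression(y, k):
    """Least-squares fit of y_t + a_1 y_{t-1} + ... + a_k y_{t-k} = residual.

    Returns the vector (1, a_1, ..., a_k).
    """
    y = np.asarray(y, dtype=float)
    T = len(y)
    # Column j holds y_{t-j} for the rows t = k+1, ..., T.
    X = np.column_stack([y[k - j: T - j] for j in range(1, k + 1)])
    coef, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return np.concatenate(([1.0], -coef))

def second_stage_ma1(a):
    """Assumed second stage: -sum_j a_j a_{j+1} / sum_j a_j^2."""
    a = np.asarray(a, dtype=float)
    return float(-np.dot(a[:-1], a[1:]) / np.dot(a, a))
```

When the fitted coefficients equal (-alpha)^j exactly, the second stage returns alpha times (1 - alpha^(2k)) / (1 - alpha^(2k+2)), which differs from alpha by a term that decays geometrically in k.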


1.4.5 


A group of papers has been written in the area, where the main stress
lies in looking at the parameters as forming the spectral density (1.28);
alternatively one says that one resorts to the Fourier transform of the
available data. Some of these suggestions have resulted in rather complicated
expressions, frequently to be solved by means of the computer, but some
seem to suggest ways for estimation in more general cases: mixed models,
vector cases, etc. Most of the procedures are iterative, and aim at
obtaining (asymptotically) efficient estimators.

Durbin (1961) presented what he calls "a spectral form" of his
earlier suggestion, the one we reviewed in section 1.4.4. Hannan (1969),
(1970), and Clevenson (1970) also have papers in this area; the former
concentrates on the $\alpha_j$'s and the latter on the $\sigma_y(s)$'s. For a recent
review of this work see Parzen (1971).
2. ESTIMATION BASED ON A FINITE NUMBER OF SAMPLE AUTOCORRELATIONS.
ASYMPTOTIC  THEORY  WHEN  THE  NUMBER  IS  A FUNCTION  OF  SAMPLE  SIZE 


2.1  Introduction. 


Walker (1961) proposed a procedure to estimate the parameters of a
moving average model of order q. He considered the vector of
autocorrelations $\rho = (\rho_y(1), \ldots, \rho_y(q))'$.

With the notation used by Anderson [(1971a), Section 5.7.2], the
final form of the estimator is

(2.1)  $\hat{\rho}_T = r_T^{(1)} - \hat{W}_{12} \hat{W}_{22}^{-1} r_T^{(2)}$ .

If $r_T$ denotes the vector whose components are the first k sample
autocorrelations (q < k < T) defined as in Section 1.4.1 by

(2.2)  $r_{jT} = c_{jT} / c_{0T}$ ,  $j = 1, 2, \ldots, k$ ,

where

(2.3)  $c_{jT} = \frac{1}{T} \sum_{t=1}^{T-j} y_t y_{t+j}$ ,  $j = 0, 1, \ldots, k$ ,

then $r_T$ is partitioned as $r_T = (r_T^{(1)\prime}, r_T^{(2)\prime})'$, where $r_T^{(1)}$ has q
components, and $r_T^{(2)}$ has k-q components. W = W($\rho$) is the covariance
matrix of the limiting normal distribution of $\sqrt{T} (r_T - \rho)$ [see e.g.
Anderson (1971a), Section 5.7.3], and it is partitioned to conform with $r_T$:

(2.4)  $W = \begin{pmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{pmatrix}$ .
By $\hat{W}$ we mean (2.4) with the components of $\rho$ replaced by the
corresponding ones of $r_T^{(1)}$. Note that $\mathrm{plim}_{T \to \infty} r_{sT} = \rho_y(s) = 0$, if
s > q.

When q = 1, $W(r_T^{(1)}) = W(r_{1T})$ and is given by

(2.5)  $W(r) = \begin{pmatrix}
1 - 3r^2 + 4r^4 & 2r(1-r^2) & r^2 & 0 & \cdots & 0 & 0 \\
2r(1-r^2) & 1 + 2r^2 & 2r & r^2 & \cdots & 0 & 0 \\
r^2 & 2r & 1 + 2r^2 & 2r & \cdots & 0 & 0 \\
0 & r^2 & 2r & 1 + 2r^2 & \cdots & 0 & 0 \\
\vdots & & & & \ddots & & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 + 2r^2 & 2r \\
0 & 0 & 0 & 0 & \cdots & 2r & 1 + 2r^2
\end{pmatrix}$ ,

where for simplicity we write $r_{1T} = r$. Then (2.1) becomes

(2.6)  $\hat{\rho}_T = r - \left( 2r(1-r^2), \; r^2, \; 0, \; \ldots, \; 0 \right)
        \begin{pmatrix} w^{11} & \cdots & w^{1,k-1} \\ \vdots & & \vdots \\ w^{k-1,1} & \cdots & w^{k-1,k-1} \end{pmatrix}
        \begin{pmatrix} r_{2T} \\ \vdots \\ r_{kT} \end{pmatrix}$
       $= r - 2r(1-r^2) \sum_{j=1}^{k-1} w^{1j} r_{j+1,T} - r^2 \sum_{j=1}^{k-1} w^{2j} r_{j+1,T}$ ,

where we have denoted $W_{22}^{-1}(r) = (w^{ij})$. Note that (2.6) can also be
written as

(2.7)  $\hat{\rho}_T = \sum_{j=0}^{k-1} m_T(j) \, r_{j+1,T}$ ,

by defining

(2.8)  $m_T(0) = 1$ ,
       $m_T(j) = -2r(1-r^2) w^{1j} - r^2 w^{2j}$ ,  $j = 1, 2, \ldots, k-1$ .

Walker developed the asymptotic theory for this proposal when k is
treated as fixed. In the following sections we present the corresponding
asymptotic theory when $k = k_T$, a function of the series length T, such
that $\lim_{T \to \infty} k_T = \infty$. We restrict our attention to the case q = 1. It was
conjectured by Walker [(1961), page 353] that such a theory could be
developed, essentially by means of the tools we use below, except that the
components of $W_{22}^{-1}$ will be evaluated explicitly.
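For q = 1 the estimator of this section can be sketched numerically. The code assumes the standard large-sample covariance pattern of sample autocorrelations from a first-order moving average: diagonal 1 + 2r^2, first off-diagonal 2r, second off-diagonal r^2, with the special entries 1 - 3r^2 + 4r^4 in position (1,1) and 2r(1 - r^2) in positions (1,2) and (2,1). NumPy is assumed, the function names are hypothetical, and at least two sample autocorrelations are required.

```python
import numpy as np

def bartlett_matrix_ma1(r, k):
    """k x k limiting covariance matrix W(r) for (r_1T, ..., r_kT), q = 1."""
    W = np.zeros((k, k))
    for i in range(k):
        W[i, i] = 1 + 2 * r ** 2
        if i + 1 < k:
            W[i, i + 1] = W[i + 1, i] = 2 * r
        if i + 2 < k:
            W[i, i + 2] = W[i + 2, i] = r ** 2
    # The first row and column involve the lag-one autocorrelation itself.
    W[0, 0] = 1 - 3 * r ** 2 + 4 * r ** 4
    if k > 1:
        W[0, 1] = W[1, 0] = 2 * r * (1 - r ** 2)
    return W

def walker_estimate_ma1(r_vec):
    """r_1 - W_12 W_22^{-1} (r_2, ..., r_k)', with W evaluated at r_1."""
    r_vec = np.asarray(r_vec, dtype=float)
    W = bartlett_matrix_ma1(r_vec[0], len(r_vec))
    return float(r_vec[0] - W[0, 1:] @ np.linalg.solve(W[1:, 1:], r_vec[1:]))
```

The higher-order sample autocorrelations have probability limit zero, so the correction term vanishes in the limit; in finite samples it adjusts the first sample autocorrelation using the information in the others. Solving the linear system avoids forming the inverse explicitly.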

2.2 Evaluation of the Components in Two Rows of the Inverse Matrix.

From (2.4) and (2.5) we see that

(2.9)  $W_{22}(r) = (1 + 2r^2) I + 2r G_1 + r^2 G_2$ ,

where the $G_j$ matrices were introduced in Section 1.2. From now on, for
convenience, we take the order of $W_{22}$ to be $k_T$ (sometimes denoted by k)
instead of $k_T - 1$. The evaluation of the components of $W_{22}^{-1}$ is treated in
Mentz [(1972), Section 4]. To evaluate (2.6) we only need the first two
rows of $W_{22}^{-1}$, or equivalently the first two columns since $W_{22}$ is
symmetric. Let

(2.10)  $a = 1 + 2r^2$ ,  $b = 2r$ ,  $c = r^2$ ,

so that

(2.11)  $W_{22} = a I + b G_1 + c G_2$ .
We assume throughout that $|r| < 1/2$, a condition that $\rho_y(1)$ was shown
to satisfy.

The associated polynomial equation that corresponds to this problem is

(2.12)  $c x^4 + b x^3 + a x^2 + b x + c = 0$ ,

and has roots

(2.13)  $x_1 = x_2 = \frac{-1 + \sqrt{1 - 4r^2}}{2r} = \frac{2r}{-1 - \sqrt{1 - 4r^2}}$ ,

(2.14)  $x_3 = x_4 = \frac{-1 - \sqrt{1 - 4r^2}}{2r} = \frac{2r}{-1 + \sqrt{1 - 4r^2}}$ .


Hence  (2.12)  has  the  roots  x^,  x^  , each  with  multiplicity  two,  where 
jx^l  <1.  It  then  follows  that  the  components  or  are  given 


(2.15)  wlJ  = [C1( j)  + i C2( j)3x^  + [C5( j)  + i CJ+( j^x"1  , i>  j = 1,-...,]^. 
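The root structure of (2.12) is easy to confirm numerically. The following sketch (helper names `quartic_roots` and `x1` are hypothetical) computes the four roots with `numpy.roots` and compares them with the closed form $x_1 = (-1 + \sqrt{1-4r^2})/(2r)$ and its reciprocal. Because the roots are double, a general-purpose root finder recovers them only to roughly half machine precision, which is one practical reason to prefer the closed form.

```python
import numpy as np

def quartic_roots(r):
    """Roots of c x^4 + b x^3 + a x^2 + b x + c = 0 with a, b, c as in (2.10)."""
    a, b, c = 1 + 2 * r**2, 2 * r, r**2
    return np.roots([c, b, a, b, c])

def x1(r):
    """Closed-form root of r x^2 + x + r = 0 with modulus below one (0 < |r| < 1/2)."""
    return (-1 + np.sqrt(1 - 4 * r**2)) / (2 * r)

if __name__ == "__main__":
    r = 0.4
    print(np.sort(np.abs(quartic_roots(r))))  # two roots near |x1|, two near 1/|x1|
    print(x1(r), 1 / x1(r))
```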


The constants $C_s(j)$ in (2.15), for columns $j = 1, 2$, are evaluated
from the matrix equations

(2.16)    $A\, C(1) = (1, 0, 0, 0)'$ ,    $A\, C(2) = (0, 1, 0, 0)'$ ,

where $C(j) = (C_1(j), C_2(j), C_3(j), C_4(j))'$. In terms of partitioned
matrices, the solutions of (2.16) are

(2.17)


(2.18) 


The components of $A_{11}$ are

(2.19)
    $a_{11} = a x_1 + b x_1^2 + c x_1^3 = (r/2)\, [\sqrt{1-4r^2} - 3]$ ,
    $a_{12} = a x_1 + 2b x_1^2 + 3c x_1^3 = -(r/2)\, [\sqrt{1-4r^2} + 1]$ ,
    $a_{21} = b x_1 + a x_1^2 + b x_1^3 + c x_1^4 = -r^2$ ,
    $a_{22} = b x_1 + 2a x_1^2 + 3b x_1^3 + 4c x_1^4 = 0$ .

The components of $A_{12}$ are of the same form as those in (2.19), with $x_1$
replaced by $x_1^{-1}$.
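The four identities for the components of $A_{11}$ follow from $r x_1^2 + x_1 + r = 0$ and from $x_1$ being a double root of (2.12). As a sanity check, the sketch below evaluates both sides numerically; the function name `a11_components` is hypothetical and the closed forms are the ones stated for $a_{11}$, $a_{12}$, $a_{21}$, $a_{22}$ with $a$, $b$, $c$ as in (2.10).

```python
import math

def a11_components(r):
    """Return (direct, closed) evaluations of a_11, a_12, a_21, a_22."""
    a, b, c = 1 + 2 * r**2, 2 * r, r**2
    s = math.sqrt(1 - 4 * r**2)
    x = (-1 + s) / (2 * r)                      # root x_1 with |x_1| < 1
    direct = (
        a * x + b * x**2 + c * x**3,
        a * x + 2 * b * x**2 + 3 * c * x**3,
        b * x + a * x**2 + b * x**3 + c * x**4,
        b * x + 2 * a * x**2 + 3 * b * x**3 + 4 * c * x**4,
    )
    closed = ((r / 2) * (s - 3), -(r / 2) * (s + 1), -r**2, 0.0)
    return direct, closed

if __name__ == "__main__":
    for d, c in zip(*a11_components(0.3)):
        print(f"{d: .12f}  {c: .12f}")
```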


The components of $A_{21}$ are

(2.20)
    $a_{31} = x_1^{k}\, (b + a x_1^{-1} + b x_1^{-2} + c x_1^{-3})$ ,
    $a_{32} = k x_1^{k}\, (b + \frac{k-1}{k}\, a x_1^{-1} + \frac{k-2}{k}\, b x_1^{-2} + \frac{k-3}{k}\, c x_1^{-3})$ ,
    $a_{41} = x_1^{k}\, (a + b x_1^{-1} + c x_1^{-2})$ ,
    $a_{42} = k x_1^{k}\, (a + \frac{k-1}{k}\, b x_1^{-1} + \frac{k-2}{k}\, c x_1^{-2})$ .

The components of $A_{22}$ are of the same form as those in (2.20), with $x_1$
replaced by $x_1^{-1}$.

By the rules of partitioned inversion

(2.21)    $A^{11} = (A_{11} - A_{12} A_{22}^{-1} A_{21})^{-1}$ ,    $A^{21} = -A_{22}^{-1} A_{21} A^{11}$ ,

and the matrices in (2.21) can be written as


/ a22  + b22  Xlk  + d22 


(2.22)  A 


11  1 


, 2k  1 2k 

"a21.  “ b21  X1  “ C21  kXl 


2k  ^ , 2k 

l12  "b12  X1  ”d12  ^1 


, . 2k  1 2k, 

all  bll  X1  C11  k X1 


(2.23)  A 


21  xl 


2k 

'1_ 

kA 


mllk+nllfalk+slll!2+tllk2x?  n12i+n12kxf+s12k2+t12k2,tl! 


, 2k  , , , , a 

“21+n2li  +s21k+t21lttl 


m22+n22Xl  +S22k+t22kxl 


where 


(2.24) 


Pk  1 Ilk  1 

a - hi + xi  (h2 + y + 5 v + xi  y + y + 1 y 


(2.25) 


hl  ' ail  a22  • ai2a21  + ° 


The $b_{ij}$, $c_{ij}$, $d_{ij}$, $m_{ij}$, $n_{ij}$, $s_{ij}$, $t_{ij}$ and $h_i$ in expressions (2.22)
- (2.25) are either linear combinations of the original $a_{ij}$ defined in (2.19)
and (2.20), that do not involve $k_T$, or at most functions of $k_T$ through
factors like $(k_T - s)/k_T$, for $s = 1, 2$ or $3$. Note however that in general
they are random variables, functions of $r_{1T}$.

For our purposes there is no need to specify the $C_s(j)$'s ($j = 1, 2$) in
greater detail.

In the case of $q = 1$, from (2.6) or (2.7) we see that to prove the
consistency of Walker's estimator of $\rho_y(1)$ it suffices to show that

(2.26)    $\operatorname{plim}_{T\to\infty} \sum_{j=1}^{k_T-1} m_T(j)\, r_{j+1,T} = 0$ ;

this will be done now.


Theorem 2.1. Let $y_t$ satisfy equation (1.7) for $t = \ldots, -1, 0, 1, \ldots$,
where $0 < |\alpha| < 1$ and the $e_t$ are independent, normal with $\mathcal{E} e_t = 0$,
$\mathcal{E} e_t^2 = \sigma^2$ ($0 < \sigma^2 < \infty$) for all $t$. Suppose that a set of observations
of $\{y_t\}$ at times $t = 1, 2, \ldots, T$ is available, and that $k = k_T$ is a
function of $T$ ($T > k+1$), satisfying

(2.27)    $\lim_{T\to\infty} k_T = \infty$ .

Then, if $\hat{\rho}_T$ is defined by (2.7),

(2.28)    $\operatorname{plim}_{T\to\infty} \hat{\rho}_T = \rho_y(1)$ .

Proof. Let us take the $w^{ij} = w^{ij}(r)$, $j = 1, 2$, in the definition of the
estimator, as those evaluated in Section 2.2 when $W_{22}$ is taken to be of
order $k_T$, since their difference with those when $W_{22}$ is of order
$k_T - 1$ is negligible as $T \to \infty$. Then for $j = 1, 2, \ldots, k_T - 1$ we have
that

          $-m_T(j) = 2r(1-r^2)\, w^{1j} + r^2\, w^{2j}$
          $= \{2r(1-r^2)[C_1(1) + jC_2(1)] + r^2[C_1(2) + jC_2(2)]\}\, x_1^{j}$
(2.29)    $\quad + \{2r(1-r^2)[C_3(1) + jC_4(1)] + r^2[C_3(2) + jC_4(2)]\}\, x_1^{-j}$
          $= \{2r(1-r^2)(a^{11} + j a^{21}) + r^2(a^{12} + j a^{22})\}\, x_1^{j}$
          $\quad + \{2r(1-r^2)(a^{31} + j a^{41}) + r^2(a^{32} + j a^{42})\}\, x_1^{-j}$ ,

where the $a^{ij}$ are given in (2.22) and (2.23).

Replacement in (2.26) gives two corresponding terms. The one associated
with the second braces of (2.29) is easily shown to converge to 0 in
probability, because the $a^{ij}$ have $x_1^{2k}$ as dominating factor; see (2.22)
and (2.23). The term associated with the first braces of (2.29) is
handled differently: for any fixed number of initial summands in it, it
can be used that $\operatorname{plim}_{T\to\infty} r_{jT} = 0$ for $j > 1$, while for large enough $j$
the exponentially declining $x_1^{j}$ is relevant, even considering that the
number of terms increases with $T$. The details are given in Section 7.1.


2.4 


In this section we prove that when the estimator of $\rho_y(1)$ proposed
by Walker is based on $k$ sample autocorrelations, and $k$ is taken to be
a function of $T$, it still has a limiting normal distribution. We first
state two lemmas.

Lemma 2.1. Let $0 < a < 1$, $T = 1, 2, \ldots$ and $k_T$ be a function of $T$
such that $\lim_{T\to\infty} k_T = \infty$. Let $n$ and $m$ be positive constants. Then
a necessary and sufficient condition that

(2.30)


is  that 


(2.31) 


li: 


V 


Kp 


o . 


Lemma 2.2. Let the sequence of random variables $\{Z_T\}$ converge in
distribution to the random variable $Z$. Suppose that the sequence $\{Y_T\}$
converges in probability to 0. Then

(2.32)    $\operatorname{plim}_{T\to\infty} Z_T Y_T = 0$ .

These lemmas are standard results in analysis and probability theory,
respectively, and will be proved directly only for the sake of completeness.
The proofs constitute Section 7.2.

The theorem we shall prove in this section is the following:

Theorem 2.3. Let the conditions of Theorem 2.1 hold, together with


(2.33) 


lim. 


T-»oo 


0 3 


Lim  — = 0 . 
T-3>°°  T 


Then $\sqrt{T}\, (\hat{\rho}_T - \rho_y(1))$ has a limiting normal distribution with parameters
0 and $(1-\alpha^2)^3 / (1+\alpha^2)^4$.

Proof. The proof of the theorem will be done in five parts, as follows:

Part 1. (Replacement of sample autocorrelations by sample autocovariances)

kT

PT“Py(l)  ^ l ^T"Py(l) 


'OT  j=l 


kT 

I ^(j-l)(c1T“  I'c  ) 


fClT  ffy(l) 


jT  jT  cqt 


(2.34) 


'OT 


_T  0 (1) 

An  J“1)(CjT"^CjT)  “ a(o)  ^COT=^COT^  + 


J=1 


y 


Cr'^1)  jrCOT 


1 ^ 1 
™ £n  ^(j"1XCjT“£c.jT)  T ~~e)  * 


'OT  ,1=0 


OT 


where  we  define 


(2.35) 


a 


a (I) 


In the last line of (2.34), we can replace $c_{0T}$ by $\sigma_y(0) = \operatorname{plim}_{T\to\infty} c_{0T}$
without affecting the resulting limiting distribution [cf. Rao (1965),
Section 6a.2]. Also note that $\operatorname{plim}_{T\to\infty} \sqrt{T}\, (1/T)\, [a(1)/c_{0T}] = 0$.

Hence the conclusion of this part of the proof is that $\sqrt{T}\, (\hat{\rho}_T - \rho_y(1))$
has the same limiting distribution as

(2.36)    $\frac{\sqrt{T}}{\sigma_y(0)} \sum_{j=0}^{k_T} m_T(j-1)\, (c_{jT} - \mathcal{E} c_{jT})$ .


Part 2. (Simplifying the $m_T(j)$'s).

We have that $m_T(-1) = -\rho_y(1)$, $m_T(0) = 1$, and $m_T(j)$ is given by
(2.29) for $j = 1, 2, \ldots, k_T - 1$. From the argument in the proof of
Theorem 2.1 we see that we can write, say,

(2.37)    $m_T(j) = m_{1,T}(j) + x_{1T}^{\lambda k_T}\, m_{2,T}(j)$ ,    $j = 1, 2, \ldots, k_T - 1$ ,

where $0 < \lambda \le 2$. We want to argue that we can disregard the part with
$x_{1T}^{\lambda k_T}$ as a factor, and then find an explicit form for $m_{1,T}(j)$. This
is done in Section 7.3.2.

The conclusion of this part of the proof is to assert that it suffices
to find the limiting distribution of (2.36) when each $m_T(j)$ is replaced
by $m_{1,T}(j)$ given by

(2.38)    $m_{1,T}(j) = -\frac{\alpha}{1+\alpha^2}$ ,    $j = -1$ ,
          $m_{1,T}(j) = x_{1T}^{j}\, (1 + j \sqrt{1-4r^2})$ ,    $j = 0, 1, \ldots, k_T - 1$ .

Here of course $r = r_{1T}$ and $x_{1T} = x_1(r_{1T})$ are random variables.

Part  3.  (Substituting  parameters  for  random  variables  in  the  m (j)'s). 


Here  we  prove  that 


V1 


(2.39) 


PlimT^ro/T  I [m1^T( J”l)  ‘ J“DKcjt-  ^cjT) 


Y 


= P11^  /T  I t^T(J)  - “tJ)3  =j+1,T  - 


= 0 


where  we  used  that 


Our  notation  is: 


JT 


0 for  j = 2,3, 

' 1T;  ” "y' ' “l 


r = rw  P = P (1),  x,  = x 


IT 


^2  ^ ■'■"j.T  ^ } 


X1  = X1^P  (■*-))  = “ a*  Now: 

so that the random variables in (2.39) will be taken to be formed by the
corresponding two terms.

The sum over $j$ of the first term is of the form (7.23) treated in
Section 7.3.3, namely $\sqrt{T} \sum_{j=1}^{k_T-1} j\, x_1^{j}\, c_{j+1,T}$. Since $x_1^{j} = (-\alpha)^j$ is
summable ($|\alpha| < 1$), the sum over $j$ converges in distribution to a
normal random variable with zero expected value and finite variance.
Further $\sqrt{1-4r^2} \to \sqrt{1-4\rho_y^2(1)}$ in probability as $T \to \infty$, so that the
second summand converges stochastically to zero, by Lemma 2.2. In the
second term we have to deal with

(2.41)    $\sqrt{T} \sum_{j=1}^{k_T-1} (x_{1T}^{j} - x_1^{j})\, c_{j+1,T}$ ,

or this same expression with weights $j (x_{1T}^{j} - x_1^{j})$; see (2.40). The proof
will be completed if each such term converges stochastically to zero. We
treat the case of (2.41) in detail, since for the other one a parallel
argument holds. The algebraic steps are presented in Section 7.3.2.
The consequence of this part of the proof is that instead of (2.36)
we now must prove that

(2.42)    $\frac{\sqrt{T}}{\sigma_y(0)} \sum_{j=0}^{k_T} m_1(j-1)\, (c_{jT} - \mathcal{E} c_{jT})$

has the limiting normal distribution claimed in the theorem.

Part  4.  (The  asymptotic  normality). 

Let  be  the  random  variable  in  (2.42).  Substituting  for.  the 

c^T«s  from  (2.3).  of  Section  2.1,  we  have  that 


(2.43) 


T-j 

flT  ■ vro7  ^ .1  “(J-1)  I -?ytW 


1 T 

~ Z wtT  > 

/t  t=l 


where 


(2.44) 


k 

\t“  J0  (ytyt+J-?ytyt+j) 


t - 1;2; . • .^T-k^ 


T-t 


^0  (ytyt+J'fytyt+3) 


t = T- 


V1 


2 9 • • p T • 


In  Section  7 -3 *3  we  argue  that  (2.45)  is  asymptotically  normally 
distributed  with  parameters  0 and 


(2>J+^  T = limr_>0O  + 2 W1T  W2T)  . 

Part 5. (The asymptotic variance).

To complete the proof it suffices to show that in (2.45)

(2.46)    $\tau = (1-\alpha^2)^3 / (1+\alpha^2)^4$ ,

where the expectations in (2.45) are given by

1 ^ kT 

(2.47)  <fwtT  Wt+S;T  - Io  3,I0  dJa,(s)  , 

using the $d_{jj'}(s)$ introduced in expression (7.27) of Section 7.3.3.

The evaluation of $\tau$ is presented in detail in Section 7.3.5.

The conclusion of Theorem 2.3 can easily be used to prove the
following:

Corollary 2.4. Under the conditions of Theorem 2.3, let $\hat{\alpha}_T$ be defined
by (1.34) with $r_{1T}$ replaced by $\hat{\rho}_T$. Then $\sqrt{T}\, (\hat{\alpha}_T - \alpha)$ has a limiting
normal distribution with parameters 0 and $1 - \alpha^2$.

Hence we showed that under the stated conditions, the procedure
in this chapter achieves asymptotically the variance of the maximum
likelihood estimator.
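The step from Theorem 2.3 to Corollary 2.4 is a one-line application of the delta method. The worked computation below is a sketch under the assumption that (1.34) defines $\hat{\alpha}_T$ as the solution with modulus below one of $\rho = \alpha/(1+\alpha^2)$, evaluated at the estimated correlation; the symbol $\operatorname{avar}$ (asymptotic variance) is introduced here only for illustration.

```latex
\begin{align*}
\rho_y(1) = \frac{\alpha}{1+\alpha^2}
  \quad &\Longrightarrow \quad
  \frac{d\rho_y(1)}{d\alpha} = \frac{1-\alpha^2}{(1+\alpha^2)^2}, \\[4pt]
\operatorname{avar}\bigl(\sqrt{T}\,\hat{\alpha}_T\bigr)
  &= \operatorname{avar}\bigl(\sqrt{T}\,\hat{\rho}_T\bigr)
     \left(\frac{d\rho_y(1)}{d\alpha}\right)^{-2} \\
  &= \frac{(1-\alpha^2)^3}{(1+\alpha^2)^4}
     \cdot \frac{(1+\alpha^2)^4}{(1-\alpha^2)^2}
   = 1-\alpha^2 .
\end{align*}
```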


3.  ESTIMATION BASED ON A FINITE NUMBER OF SAMPLE AUTOCORRELATIONS.
A MODIFICATION TO SIMPLIFY THE COMPUTATIONS

From the argument in Chapter 2 it follows that Walker's estimator
of $\rho_y(1)$ for the first-order moving average, given in (2.7) as

(3.1)    $\hat{\rho}_T = \sum_{j=0}^{k-1} m_T(j)\, r_{j+1,T}$ ,

is asymptotically equivalent to the estimator

(3.2)    $\rho_T^* = \sum_{j=0}^{k-1} m_{1,T}(j)\, r_{j+1,T}$ ,


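where

(3.3)    $m_{1,T}(j) = x_{1T}^{j}\, (1 + j \sqrt{1 - 4 r_{1T}^2})$ ,    $j = 0, 1, \ldots, k-1$ ,

and

(3.4)    $x_{1T} = \frac{-1 + \sqrt{1 - 4 r_{1T}^2}}{2 r_{1T}}$ .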


The modified estimator $\rho_T^*$ discards from $\hat{\rho}_T$ parts having
$x_{1T}^{\lambda k}$ as a factor, and hence differs only slightly from $\hat{\rho}_T$ if $k$ is
moderately large.

To compute (3.1) Walker [(1961), pp. 347-348] proposed an iterative
procedure. The form (3.2) is of course much simpler, and reflects also
the fact that the necessary components of the inverse matrix $W_{22}^{-1}$ have
been obtained in closed form.

From a practical point of view the form (3.2) makes easy the choice
of $k$, guided by the degree of numerical approximation that is desired.
In fact $|x_{1T}|^{j}$ approaches zero fast as $j$ increases, and $j |x_{1T}|^{j}$ increases
until $j$ reaches a value approximately equal to $-(\ln |x_{1T}|)^{-1}$ and then
decreases. Consider Table 3.1.


Table 3.1

Values of $m_{1,T}(j)$ for selected values of $r$

  j     r = .05      r = .10      r = .15      r = .20      r = .25

  1    -.1000000    -.2000000    -.3000000    -.4000000    -.5000000
  2     .0075125     .0302030     .0685482     .1234089     .1961524
  3    -.0005018    -.0040612    -.0139772    -.0340895    -.0692193
  4     .0000314     .0005123     .0026761     .0088540     .0230114
  5    -.0000018    -.0000620    -.0004922    -.0022109    -.0073620
  6     .0000001     .0000073     .0000880     .0005372     .0022931
  7    0.0000000    -.0000008    -.0000154    -.0001279    -.0007003
  8    0.0000000    0.0000000     .0000026     .0000300     .0002106
  9    0.0000000    0.0000000    -.0000004    -.0000069    -.0000626
 10    0.0000000    0.0000000    0.0000000     .0000015     .0000184
 11    0.0000000    0.0000000    0.0000000    -.0000003    -.0000053
 12    0.0000000    0.0000000    0.0000000    0.0000000     .0000015
 13    0.0000000    0.0000000    0.0000000    0.0000000    -.0000004
 14    0.0000000    0.0000000    0.0000000    0.0000000     .0000001
 15    0.0000000    0.0000000    0.0000000    0.0000000    0.0000000


Table 3.1 (Continued)

  j     r = .30      r = .35      r = .40      r = .45

  1    -.6000000    -.7000000    -.8000000    -.9000000
  2     .2888888     .4049504     .5500000     .7353557
  3    -.1259259    -.2140023    -.3500000    -.5682477
  4     .0518518     .1072520     .2125000     .4234477
  5    -.0205761    -.0519085    -.1250000    -.3075804
  6     .0079561     .0245097     .0718750     .2192185
  7    -.0030178    -.0113615    -.0406250    -.1539701
  8     .0011278     .0051919     .0226562     .1068903
  9    -.0004166    -.0023457    -.0125000    -.0735060
 10     .0001524     .0010500     .0068359     .0501521
 11    -.0000553    -.0004664    -.0037109    -.0339916
 12     .0000199     .0002058     .0020019     .0229082
 13    -.0000071    -.0000903    -.0010742    -.0153631
 14     .0000025     .0000394     .0005737     .0102590
 15    -.0000009    -.0000171    -.0003051    -.0068249
 16     .0000003     .0000074     .0001617     .0045251
 17    -.0000001    -.0000032    -.0000854    -.0029913
 18    0.0000000     .0000013     .0000450     .0019721
 19    0.0000000    -.0000005    -.0000236    -.0012970
 20    0.0000000     .0000002     .0000123     .0008511
 21    0.0000000    -.0000001    -.0000064    -.0005574
 22    0.0000000    0.0000000     .0000033     .0003643
 23    0.0000000    0.0000000    -.0000017    -.0002377
 24    0.0000000    0.0000000     .0000009     .0001549
 25    0.0000000    0.0000000    -.0000004    -.0001008
 26    0.0000000    0.0000000     .0000002     .0000654
 27    0.0000000    0.0000000    -.0000001    -.0000425
 28    0.0000000    0.0000000    0.0000000     .0000275
 29    0.0000000    0.0000000    0.0000000    -.0000178
 30    0.0000000    0.0000000    0.0000000     .0000115


For $r$ negative the values of $m_{1,T}(j)$ are those of Table 3.1
all taken with positive signs.

Once the estimating value of $r_1$ is available, the table can be
used to decide how many autocorrelations $r_j$, $j = 2, 3, \ldots$ to include
in the correction of $r_1$ given by (3.2).
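The entries of Table 3.1 can be regenerated in a few lines. The sketch below assumes the closed form $m_{1,T}(j) = x_1^{j}(1 + j\sqrt{1-4r^2})$ with $x_1 = (-1+\sqrt{1-4r^2})/(2r)$; this expression is an assumption checked against the tabulated values (for example $r = .40$ gives $.55$, $-.35$, $.2125$ at $j = 2, 3, 4$), and the helper names `m1` and `terms_needed` are hypothetical.

```python
import math

def m1(r, j):
    """Assumed closed form x_1^j (1 + j sqrt(1 - 4 r^2)) for 0 < |r| < 1/2."""
    s = math.sqrt(1.0 - 4.0 * r * r)
    x1 = (-1.0 + s) / (2.0 * r)
    return x1**j * (1.0 + j * s)

def terms_needed(r, tol=1e-7, j_max=1000):
    """Smallest j with |m1(r, i)| < tol for every i from j up to j_max."""
    last_large = 0
    for j in range(1, j_max + 1):
        if abs(m1(r, j)) >= tol:
            last_large = j
    return last_large + 1

if __name__ == "__main__":
    for r in (0.05, 0.25, 0.45):
        row = ", ".join(f"{m1(r, j): .7f}" for j in range(1, 6))
        print(f"r = {r:.2f}: {row}   cutoff j = {terms_needed(r)}")
```

Running `terms_needed` for the estimated $r_1$ gives one way to decide how many sample autocorrelations to retain in (3.2) for a chosen numerical tolerance.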

The main points discussed in this chapter can be summarized as
follows.

Theorem 3.1. Under the conditions of Theorem 2.1, let $\rho_T^*$ be defined
in (3.2). Then $\operatorname{plim}_{T\to\infty} \rho_T^* = \rho_y(1)$.

Theorem 3.2. Under the conditions of Theorem 2.3, let $\rho_T^*$ be defined
in (3.2). Then as $T \to \infty$, $\sqrt{T}\, (\rho_T^* - \rho_y(1))$ has a limiting normal
distribution with parameters 0 and $(1-\alpha^2)^3 / (1+\alpha^2)^4$.

4.  ESTIMATION BASED ON THE FINITE AUTOREGRESSIVE APPROXIMATION.
ASYMPTOTIC THEORY WHEN THE ORDER IS FIXED

4.1  Introduction.

Durbin (1959) proposed an estimation procedure for the parameters
of (1.1) that we here analyze for the simplest case of $q = 1$.

As seen in Section 1.2, if we want an exact representation of (1.7)
of the autoregressive type we can choose between (1.17), whose residuals
are correlated, and (1.27), where the order of the autoregression is
infinite. Durbin's idea is to use instead an approximation of the form

(4.1)    $\sum_{j=0}^{k} \beta_j\, y_{t-j} = u_t$ ,

where $\beta_0 = 1$, the $u_t$'s are assumed uncorrelated with zero means and
constant variance, and the order $k$ is assumed large enough to make the
approximation useful for the purposes of estimation. The choice of $k$
turns out to be a major theoretical and practical issue, but we postpone
its discussion until later.

The first stage of Durbin's proposal consists in estimating the
$\beta_j$ in (4.1) by ordinary least squares. If we denote

(4.2)    $\mathbf{y}_t = (y_t, y_{t-1}, \ldots, y_{t-(k-1)})'$ ,    $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_k)'$ ,

equation (4.1) leads to

(4.3)    $y_t = -\mathbf{y}_{t-1}'\, \boldsymbol{\beta} + u_t$ ,    $t = k+1, \ldots, T$ ,

and the normal estimating equations are

(4.4)    $\sum_{t=k+1}^{T} \mathbf{y}_{t-1}\, y_t + \sum_{t=k+1}^{T} \mathbf{y}_{t-1} \mathbf{y}_{t-1}'\, \mathbf{b}_T = 0$ .

If we introduce the notation

(4.5)    $M_T = \frac{1}{T} \sum_{t=k+1}^{T} \mathbf{y}_{t-1} \mathbf{y}_{t-1}'$ ,    $\mathbf{m}_T = \frac{1}{T} \sum_{t=k+1}^{T} \mathbf{y}_{t-1}\, y_t$ ,

where $M_T$ is of order $k \times k$ and $\mathbf{m}_T$ is of order $k \times 1$, the solution
of (4.4) is

(4.6)    $\mathbf{b}_T = -M_T^{-1}\, \mathbf{m}_T$ .

The $k \times k$ matrix $\mathbf{y}_{t-1} \mathbf{y}_{t-1}' = (y_{t-i}\, y_{t-j})$ is of rank 1 (every
minor of order 2 is 0). However the matrix $\sum_t \mathbf{y}_{t-1} \mathbf{y}_{t-1}'$, where the sum
is over at least $k$ values of $t$, is positive definite with probability
one: the condition for linear dependence among columns is that there
exist $c_j$'s, not all equal to zero, such that

$0 = \sum_{j=1}^{k} c_j \sum_t y_{t-i}\, y_{t-j} = \sum_t y_{t-i} \sum_{j=1}^{k} c_j\, y_{t-j}$ ,    $i = 1, \ldots, k$ ,

and the probability is 0 that the same linear combination of the $y_t$'s
is 0. Since in our asymptotic arguments $T$ is large compared with $k$,
$M_T$ defined in (4.5) is positive definite, and hence nonsingular, with
probability 1.

It will be proved in Lemma 4.3 that $\operatorname{plim}_{T\to\infty} M_T = \Sigma = (\sigma_y(i-j))$,
for each fixed $k$; that is, $M_T$ estimates consistently the covariance
matrix of a segment $(y_1, \ldots, y_k)$ sampled from (1.7). The components
of $M_T$ and $\mathbf{m}_T$ are slightly different from the sample autocovariances
defined in Sections 1.4.1 and 2.1, all being based on $T-k$ terms. Durbin
[(1959), p. 312] also considered using $c_j$'s to estimate the $\beta_j$, as
will be discussed in Chapter 5.


Let $\mathbf{b}_T = (b_{1T}, b_{2T}, \ldots, b_{kT})'$. Then Durbin's final estimator for
$\alpha$ is

(4.7)    $\hat{\alpha}_T = -\, \frac{\sum_{i=0}^{k-1} b_{iT}\, b_{i+1,T}}{\sum_{i=0}^{k-1} b_{iT}^2}$ ,

where $b_{0T} = 1$. To preserve some symmetry we let the sum in the denominator
of (4.7) include terms only up to $k-1$, as in the numerator, while it
could also include $b_{kT}^2$; for $k$ moderately large and as $T \to \infty$, the
difference between the two possibilities will be very small.

Durbin's argument to pass from (4.6) to (4.7) is based on approximating
the joint distribution of the $b_{jT}$'s, introducing the parameter
$\alpha$ by equating the covariances with those of the moving-average
model, and then looking for the maximum likelihood estimator of $\alpha$. From
our point of view we take (4.7) and (4.6) as defining the estimator, and
try to derive its asymptotic properties.
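Taking (4.6) and (4.7) as the definition of the estimator, the two stages can be sketched in a short simulation. The code below is illustrative only: it simulates a first-order moving average $y_t = e_t + \alpha e_{t-1}$ with unit error variance, solves the least-squares normal equations for $\mathbf{b}_T$, and forms the ratio in (4.7) with $b_{0T} = 1$. The function names `simulate_ma1` and `durbin_ma1` are hypothetical.

```python
import numpy as np

def simulate_ma1(alpha, T, rng):
    """Draw y_1, ..., y_T from y_t = e_t + alpha * e_{t-1}, e_t iid N(0, 1)."""
    e = rng.standard_normal(T + 1)
    return e[1:] + alpha * e[:-1]

def durbin_ma1(y, k):
    """Two-stage estimator: AR(k) least squares, then the ratio in (4.7)."""
    T = len(y)
    # Row for time t holds the lagged values y_{t-1}, ..., y_{t-k}.
    lags = np.column_stack([y[k - j:T - j] for j in range(1, k + 1)])
    M = lags.T @ lags / T                 # M_T of (4.5)
    m = lags.T @ y[k:] / T                # m_T of (4.5)
    b = -np.linalg.solve(M, m)            # b_T = -M_T^{-1} m_T, as in (4.6)
    b = np.concatenate(([1.0], b))        # prepend b_0 = 1
    return -np.sum(b[:-1] * b[1:]) / np.sum(b[:-1] ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = simulate_ma1(0.5, 20000, rng)
    for k in (1, 5, 10):
        print(k, durbin_ma1(y, k))
```

With $k = 1$ the estimator reduces to the first sample autocorrelation, so it settles near $\alpha/(1+\alpha^2)$ rather than $\alpha$; larger $k$ moves it toward $\alpha$, which is the behaviour quantified in Theorem 4.1.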

Durbin argued that provided one can choose $k$ as needed, the
estimator would be consistent and achieve asymptotically

(4.8)    $\operatorname{Var}(\hat{\alpha}_T) \approx \frac{1}{T}\, (1-\alpha^2)$ ,

which is Whittle's [(1954), p. 432] evaluation of the minimum asymptotic
variance of consistent estimators of $\alpha$. Our main efforts are directed
towards giving detailed proofs of these assertions, and trying to treat
$k$ formally.

Note that if in (4.7) $b_{iT}$ is replaced by $(-\alpha)^i$, then (4.7)
becomes equal to $\alpha$. This provides an interpretation of Durbin's final
form of the estimator. The interpretation is based on the fact that
if the $u_t$ are considered to approximate the $e_t^*$ of (1.17), then $\beta_j$
is approximately equal to $(-\alpha)^j$ and hence $b_{jT}$ approximately estimates
$(-\alpha)^j$. The approximation is 'a priori' very good, in the sense that up
to second-order moments $\operatorname{Var}(e_t^*)$ differs from a constant by a factor
$(1 + \alpha^{2k+2})$, which tends to one very fast [cf. (1.20)]. But note that if
in (4.1) we substitute directly $\beta_j = (-\alpha)^j$, we will not obtain a simple
estimating procedure for $\alpha$; in fact we will then be led to equations
similar to (1.36) in level of complexity.

One attraction of Durbin's proposal is that both stages are based on
linear operations. There exists then a good motivation to investigate
some of the details of the method. Many of the known estimation procedures
are also two-staged, but are computationally more complicated.


We now consider the evaluation of $\operatorname{plim}_{T\to\infty} \hat{\alpha}_T$ when $k$ is regarded
as fixed, not changing with $T$. In this section we treat the case $q = 1$.


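Theorem 4.1. Let $y_t$ satisfy equation (1.7) for $t = \ldots, -1, 0, 1, \ldots$,
where $0 < |\alpha| < 1$ and the $e_t$ are independent, normal with $\mathcal{E} e_t = 0$,
$\mathcal{E} e_t^2 = \sigma^2$ ($0 < \sigma^2 < \infty$) for all $t$. Suppose that $k$ is chosen satisfying
$k \ge 1$, and that a set of observations of $\{y_t\}$ at times $t = 1, 2, \ldots, T$
is available, where $T > k+1$. Then for $\hat{\alpha}_T$ defined by (4.7) we have

(4.9)    $\operatorname{plim}_{T\to\infty} \hat{\alpha}_T = \alpha\, \frac{(1-\alpha^{2k})(1+\alpha^{2k+4}) - k \alpha^{2k} (1-\alpha^4)}{(1-\alpha^{2k})(1+\alpha^{2k+6}) - 2k \alpha^{2k+2} (1-\alpha^2)}$

         $= \alpha + \alpha^{2k+1} (1-\alpha^2)\, \frac{\alpha^4 (1-\alpha^{2k}) - k (1-\alpha^2)}{(1-\alpha^{2k})(1+\alpha^{2k+6}) - 2k \alpha^{2k+2} (1-\alpha^2)}$ .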


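To prove this assertion we present three lemmas, but first observe that
(1.12) implies that

(4.10)    $\Sigma = \mathcal{E}\, \mathbf{y}_t \mathbf{y}_t' = \sigma^2 \begin{pmatrix} 1+\alpha^2 & \alpha & 0 & \cdots & 0 \\ \alpha & 1+\alpha^2 & \alpha & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & \alpha & 1+\alpha^2 \end{pmatrix} = \sigma^2 P$ ,

(4.11)    $\mathcal{E}\, \mathbf{y}_{t-1}\, y_t = \sigma^2 \alpha\, \boldsymbol{\varepsilon} = \sigma^2\, \mathbf{p}$ ,    $\boldsymbol{\varepsilon} = (1, 0, \ldots, 0)'$ .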




Lemma  4.2.  Let  {z*}  be  a sequence  of  random  variables  and  let  m be 

“0 


a fixed  positive  integer.  If  each  of  the  subsequences  { z j+sm"  3 - ©•>  1; 


sequence  { z* ) does  too . 


Lemma 4.3. Under the assumptions of Theorem 4.1,

(4.12)    $\operatorname{plim}_{T\to\infty} \frac{1}{T} \sum_{t=k+1}^{T} y_{t-i}\, y_{t-j} = \sigma_y(i-j)$ ,    $i, j = 0, 1, \ldots, k$ .

The proofs of Lemmas 4.2 and 4.3 constitute Section 8.1.

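Lemma 4.4. Under the assumptions of Theorem 4.1 the vector $P^{-1} \mathbf{p}$ has
components

(4.13)    $-(-\alpha)^{j}\, \frac{1 - \alpha^{2k-2j+2}}{1 - \alpha^{2k+2}}$ ,    $j = 1, 2, \ldots, k$ .

Proof. Shaman (1968) shows that if $\Sigma^{-1} = (\sigma^{ij})$ is of order $k \times k$, then

(4.14)    $\sigma^{ij} = \frac{(-\alpha)^{j-i} (1-\alpha^{2i}) (1-\alpha^{2k-2j+2})}{\sigma^2 (1-\alpha^2) (1-\alpha^{2k+2})}$ ,    $j \ge i$ .

Now, $P^{-1} \mathbf{p} = \sigma^2 \Sigma^{-1} \mathbf{p} = \alpha \sigma^2 \Sigma^{-1} \boldsymbol{\varepsilon}$, with $\boldsymbol{\varepsilon} = (1, 0, \ldots, 0)'$. Hence the
components of $P^{-1} \mathbf{p}$ are $\alpha \sigma^2$ times those in the first row of $\Sigma^{-1}$
[$i = 1$ in (4.14)], which proves the lemma. Q.E.D.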
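The closed-form inverse in (4.14) and the resulting components (4.13) can be verified against a direct numerical inversion. The sketch below is an illustration with hypothetical helper names (`ma1_cov`, `shaman_inverse`, `beta_star`); `beta_star` returns the negative of (4.13), that is, the probability limit of the least-squares coefficients.

```python
import numpy as np

def ma1_cov(alpha, k, sigma2=1.0):
    """Covariance matrix sigma^2 P of k consecutive values of an MA(1) process."""
    g1 = np.eye(k, k=1) + np.eye(k, k=-1)
    return sigma2 * ((1 + alpha**2) * np.eye(k) + alpha * g1)

def shaman_inverse(alpha, k, sigma2=1.0):
    """Inverse of ma1_cov built from the closed form (4.14), filled by symmetry."""
    inv = np.empty((k, k))
    denom = sigma2 * (1 - alpha**2) * (1 - alpha ** (2 * k + 2))
    for i in range(1, k + 1):
        for j in range(i, k + 1):
            val = ((-alpha) ** (j - i) * (1 - alpha ** (2 * i))
                   * (1 - alpha ** (2 * k - 2 * j + 2)) / denom)
            inv[i - 1, j - 1] = inv[j - 1, i - 1] = val
    return inv

def beta_star(alpha, k):
    """Negative of (4.13): (-alpha)^j (1 - alpha^(2k-2j+2)) / (1 - alpha^(2k+2))."""
    j = np.arange(1, k + 1)
    return (-alpha) ** j * (1 - alpha ** (2 * k - 2 * j + 2)) / (1 - alpha ** (2 * k + 2))

if __name__ == "__main__":
    print(np.max(np.abs(shaman_inverse(0.6, 5) - np.linalg.inv(ma1_cov(0.6, 5)))))
    print(beta_star(0.6, 5))
```

For fixed $j$ the value returned by `beta_star` tends to $(-\alpha)^j$ as $k$ grows, since $\alpha^{2k}$ vanishes.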

Proof of Theorem 4.1. Using the notation introduced in (4.10) and (4.11),
from Lemma 4.3 we conclude that $\operatorname{plim}_{T\to\infty} M_T = \Sigma = \sigma^2 P$, and
$\operatorname{plim}_{T\to\infty} \mathbf{m}_T = \sigma^2\, \mathbf{p}$. Since $M_T$ is of order $k \times k$, the components of
$M_T^{-1}$ are continuous functions of the components of $M_T$ that do not
involve sums of order $T$ of those components. Hence

$\operatorname{plim}_{T\to\infty} M_T^{-1} = (\operatorname{plim}_{T\to\infty} M_T)^{-1} = \Sigma^{-1} = (1/\sigma^2)\, P^{-1}$ .

We then have that

$\operatorname{plim}_{T\to\infty} \mathbf{b}_T = -(1/\sigma^2)\, P^{-1}\, \sigma^2\, \mathbf{p} = -P^{-1}\, \mathbf{p}$ ,

whose components are evaluated in Lemma 4.4. Substitution in (4.7)
gives the desired answer. The details are in Section 8.2.

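Note: When $\hat{\alpha}_T$ is defined with the denominator in (4.7) equal to
$\sum_{i=0}^{k} b_{iT}^2$, expression (4.9) becomes

(4.15)    $\alpha\, \frac{(1-\alpha^{2k})(1+\alpha^{2k+4}) - k \alpha^{2k} (1-\alpha^4)}{(1-\alpha^{2k+2})(1+\alpha^{2k+4}) - 2(k+1) \alpha^{2k+2} (1-\alpha^2)}$

          $= \alpha + \alpha^{2k+1} (1-\alpha^2)\, \frac{\alpha^2 (1-\alpha^{2k+2}) - (k+1)(1-\alpha^2)}{(1-\alpha^{2k+2})(1+\alpha^{2k+4}) - 2(k+1) \alpha^{2k+2} (1-\alpha^2)}$ .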


To illustrate the importance of the factor of $\alpha$ in the first line
of (4.9), we present the results of Table 4.1. It shows the values of
$(1/\alpha) \operatorname{plim}_{T\to\infty} \hat{\alpha}_T$ for several combinations of values of $\alpha$ and $k$.
Note that the factor approaches 1 when $\alpha \to 0$ (for given $k$), while
it approaches $2(k+1)(k+2)/(2k^2 + 9k + 13)$ when $\alpha \to 1$ (by L'Hospital's
rule); the corresponding limit for (4.15) is $2k/(2k+3)$.

Table 4.1

Factors of $\alpha$

   k     α = .1        α = .2        α = .3        α = .4        α = .5

   1    .99009900     .96153846     .91743119     .86206896     .80000000
   2    .99980396     .99704788     .98649889     .96295530     .92421441
   3    .99999705     .99982313     .99819235     .99135347     .97347960
   4    .99999996     .99999056     .99978313     .99816190     .99130898
   5    .99999999     .99999952     .99997559     .99963222     .99729158
   6    .99999999     .99999997     .99999736     .99992932     .99918682
   7    .99999999     .99999999     .99999972     .99998679     .99976248
   8   1.00000000     .99999999     .99999997     .99999758     .99993204
   9   1.00000000     .99999999     .99999999     .99999956     .99998086
  10   1.00000000     .99999999     .99999999     .99999992     .99999468
  20   1.00000000    1.00000000    1.00000000    1.00000000     .99999999
  30   1.00000000    1.00000000    1.00000000    1.00000000    1.00000000

   k     α = .6        α = .7        α = .8        α = .9

   1    .73529411     .67114093     .60975609     .55248618
   2    .87197977     .81055427     .74481457     .67882587
   3    .93979090     .88987126     .82776960     .75932069
   4    .97256174     .93602243     .88136964     .81394940
   5    .98787621     .96309708     .91728999     .85290284
   6    .99478831     .97893882     .94195785     .88175021
   7    .99781092     .98812350     .95916141     .90374524
   8    .99909779     .99338314     .97126611     .92090072
   9    .99963384     .99635518     .97981879     .93452281
  10    .99985324     .99801290     .98586703     .94549381
  20    .99999998     .99999676     .99967463     .99018094
  30    .99999999     .99999999     .99999426     .99824687
  40   1.00000000     .99999999     .99999991     .99971094
  50   1.00000000     .99999999     .99999999     .99995534
  60   1.00000000    1.00000000     .99999999     .99999340
  70   1.00000000    1.00000000     .99999999     .99999905
  80   1.00000000    1.00000000    1.00000000     .99999986
  90   1.00000000    1.00000000    1.00000000     .99999998
 100   1.00000000    1.00000000    1.00000000     .99999999
 150   1.00000000    1.00000000    1.00000000     .99999999
 200   1.00000000    1.00000000    1.00000000    1.00000000

From the result of Theorem 4.1 it is easy to derive an asymptotic
expansion for $\operatorname{plim}_{T\to\infty} \hat{\alpha}_T$.


Corollary 4.5. Under the assumptions of Theorem 4.1 we have that

(4.16)    $\operatorname{plim}_{T\to\infty} \hat{\alpha}_T = \alpha + \alpha^{2k+1} (1-\alpha^2)\, [\alpha^4 - k(1-\alpha^2)] + O(k^2 \alpha^{4k})$ ,

where by definition

(4.17)    $|O(y)| < M y$

for all $y > 0$ and fixed $M > 0$.

The proof of Corollary 4.5 is in Section 8.3.

For (4.15) the probability limit as $T \to \infty$ can be written as

(4.18)    $\alpha + \alpha^{2k+1} (1-\alpha^2)\, [(2\alpha^2 - 1) - k(1-\alpha^2)] + O(k^2 \alpha^{4k})$ .
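Both the exact probability limit in (4.9) and the expansion (4.16) are simple enough to evaluate directly, which makes it easy to see how quickly the bias dies out as $k$ grows. The sketch below is an illustration only (function names `durbin_plim_factor` and `durbin_plim_expansion` are hypothetical); the first function returns the factor of $\alpha$ tabulated in Table 4.1, the second the leading terms of (4.16) without the $O(k^2 \alpha^{4k})$ remainder.

```python
def durbin_plim_factor(alpha, k):
    """Factor multiplying alpha in the first line of (4.9)."""
    a2 = alpha * alpha
    num = (1 - a2**k) * (1 + a2 ** (k + 2)) - k * a2**k * (1 - a2**2)
    den = (1 - a2**k) * (1 + a2 ** (k + 3)) - 2 * k * a2 ** (k + 1) * (1 - a2)
    return num / den

def durbin_plim_expansion(alpha, k):
    """Leading terms of (4.16), omitting the O(k^2 alpha^(4k)) remainder."""
    a2 = alpha * alpha
    return alpha + alpha ** (2 * k + 1) * (1 - a2) * (a2**2 - k * (1 - a2))

if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        exact = 0.5 * durbin_plim_factor(0.5, k)
        approx = durbin_plim_expansion(0.5, k)
        print(k, exact, approx, exact - approx)
```

For $k = 1$ the factor reduces to $1/(1+\alpha^2)$, which matches the first row of Table 4.1; the expansion is only useful for moderately large $k$, where the neglected remainder is small.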


4.3 


When  the  Sample  Size  Increases . 


Let us define the expression in (4.15) as $\alpha^*$; that is,

(4.19)    $\alpha^* = \operatorname{plim}_{T\to\infty} \hat{\alpha}_T$ ,

where $\hat{\alpha}_T$ is defined by

(4.20)    $\hat{\alpha}_T = -\, \frac{\sum_{i=0}^{k-1} b_{iT}\, b_{i+1,T}}{\sum_{i=0}^{k} b_{iT}^2}$ .

The inclusion of $b_{kT}^2$ in the denominator will simplify some of the
calculations.

Theorem 4.6. Under the assumptions of Theorem 4.1, let $\hat{\alpha}_T$ be defined
in (4.20) and $\alpha^*$ equal to (4.19). Then, as $T \to \infty$, $\sqrt{T}\, (\hat{\alpha}_T - \alpha^*)$
has a limiting normal distribution with parameters 0 and


2a2kl~gl  no2)-f  if  inA:+l;  ) 


l-a2k+^)(  i+a2k+  ) -2(  k+l)^^  i-a 


.2k+2 / n _2T2| 


(4.21) 


2a2k+2  ( l+a2 ) - ( ifq2k+2 ) ( i-a2k+l4, ) 

[( i-a2k+2 ) ( i+a2k+4 ) -2(  k+l)a2k+2(  i-a2 )]‘ 


2 U-Q2)2 


a 


(a2k+2El  + dktV  + a6k+6Bj  , 

-L  d $ 


where  B0*  and  are  functions  of  a and  k written  in  full  In 

Section  8.4.3. 


Proof. Since all needed results are homogeneous of degree 0 in $\sigma^2$, we
take $\sigma^2 = 1$ without loss of generality.

The proof of the theorem will be done in several parts, as follows.

Part 1. [Asymptotic normality of $\sqrt{T}\, (\mathbf{b}_T - \boldsymbol{\beta}^*)$].

Let

(4.22)    $\boldsymbol{\beta}^* = \operatorname{plim}_{T\to\infty} \mathbf{b}_T = -\Sigma^{-1}\, \mathbf{p}$ ,

with components given by the negative of (4.13).

First we want to show that $\sqrt{T}\, (\mathbf{b}_T - \boldsymbol{\beta}^*)$ has the same limiting
distribution as $-\sqrt{T}\, \Sigma^{-1} (\mathbf{m}_T + M_T \boldsymbol{\beta}^*)$. The details of this are given
in Section 8.4.1.

Next, $\mathcal{E} (\mathbf{m}_T + M_T \boldsymbol{\beta}^*) = 0$, and $\mathbf{m}_T + M_T \boldsymbol{\beta}^*$ has components

(4.23)    $\frac{1}{T} \sum_{t=k+1}^{T} \sum_{h=0}^{k} \beta_h^*\, y_{t-i}\, y_{t-h} = \frac{1}{T} \sum_{s=k+1-i}^{T-i} \sum_{h=0}^{k} \beta_h^*\, y_s\, y_{s+i-h} = \sum_{h=0}^{k} \beta_h^*\, \frac{1}{T} \sum_{t=k+1-i}^{T-i} y_t\, y_{t+i-h}$ ,

$i = 1, \ldots, k$, where $\beta_0^* = 1$.

These random variables have the same structure as those in equation (2.43).
By the argument given in Section 7.3.3 it follows that for fixed $i$ the
random variables $\sum_{h=0}^{k} \beta_h^*\, y_t\, y_{t+i-h}$ are finitely dependent of order
$k+1$, which is now a fixed number. By the Central Limit Theorem for
finitely dependent random variables [see for example Anderson (1971a),
Theorem 7.7.5], as $T \to \infty$ the random vector $\sqrt{T}\, (\mathbf{m}_T + M_T \boldsymbol{\beta}^*)$ has a
limiting normal distribution with parameters 0 and

(4.24)    $F = \lim_{T\to\infty} T\, \mathcal{E} (\mathbf{m}_T + M_T \boldsymbol{\beta}^*)(\mathbf{m}_T + M_T \boldsymbol{\beta}^*)'$ ,

and hence $\sqrt{T}\, (\mathbf{b}_T - \boldsymbol{\beta}^*)$ has a limiting normal distribution with parameters
0 and

(4.25)    $H = \Sigma^{-1} F\, \Sigma^{-1}$ .

Part 2. [Asymptotic covariance matrix of $\sqrt{T}\, (\mathbf{m}_T + M_T \boldsymbol{\beta}^*)$].

The components of (4.24) are


(4.26)  I I p£pJ.  ^(yt-iyt-hyB-/s-ii4'  1^1'3<5k- 

Sj,t=k+1  h,h  =0  0 

In Section 8.4.2 it is proved in detail that the components $f_{ij}$ of the
matrix $F$ defined in (4.24) are given by

(4.27)    $f_{ij} = f_{ij1} + f_{ij2}$ ,

where


f „n  = i + a2  + a2k+2  , . i=j  , 

(l-a2k+2)2 


(4.28) 


_ a + 2k+2  q(l-Qf,) 

* i-a2k+2  ' 


i-j |=1 


= 0 


otherwise  } 


f „ ,(.a\k+2 

fi)2  - 2(  > 1<;2k+2  ' 


i+j=k 


(4.29) 


1 / 1 \fi  /~y2k^"4  1 

= 2a(  -a)k  ■ ) 

(1-Cc2k+2) 


i+j=k+l  ) 


otherwise

Part 3. [Asymptotic distribution of $\sqrt{T}\, (\hat{\alpha}_T - \alpha^*)$].

From (4.20) it follows that $\hat{\alpha}_T$ is a continuous function of the
components of $\mathbf{b}_T$. Noting that

(4.30)    $\alpha^* = -\, \frac{\sum_{i=0}^{k-1} \beta_i^*\, \beta_{i+1}^*}{\sum_{i=0}^{k} \beta_i^{*2}}$

[see formula (8.3)], from a standard result in asymptotic statistical
theory it follows that $\sqrt{T}\, (\hat{\alpha}_T - \alpha^*)$ has a limiting normal distribution
with parameters 0 and

(4.31)    $v = \sum_{i,j=1}^{k} \frac{\partial \alpha^*}{\partial \beta_i^*}\, \frac{\partial \alpha^*}{\partial \beta_j^*}\, h_{ij}$ ,

where the $h_{ij}$ are the components of $H$ defined in (4.25) [see e.g.
Rao (1965), Section 6a.2]. Hence it remains to show that $v$ defined
in (4.31) agrees with (4.21); this is done in detail in Section 8.4.3.
We now derive an asymptotic expression for $v$.


Corollary 4.7. Under the conditions of Theorem 4.6, the variance of the limiting distribution of $\sqrt{T}\,(\hat\alpha_T - \alpha^*)$ is

(4.32)  v = (1-a2)  |l-a2k[i-8a2+i4a^-8ka2(i=a2)]|  + ( i-a2)^°^a2k+o(a2k)  , 

where  ''  is  (8.70)  of  Section  8.4.3  with  a2k  replaced  by  0 . 

The  proof  is  in  Section  8.5. 

By  rearranging  its  terms  bZ^  can  be  written  as 
(4.33)


3 


= I P,kJ 

3=0  J 

for  some  coefficients  p.  that  are  functions  only  of  a.  We  omit  these 
details  here. 

4.4 Behavior of the Parameters of the Asymptotic Distributions When the Order of the Approximating Autoregression Increases.

One way to interpret the proposal studied in the previous sections is that for sufficiently large samples (so that the limiting distribution as $T \to \infty$ is a good approximation), by suitable choice of k one obtains an estimator which is very close to being consistent for $\alpha$, and whose variance is very close to $(1-\alpha^2)/T$. Another possible approach is to state k as a function of T, and fix the rate at which T dominates k; this was done in Chapter 2 for a method of estimating the serial correlations.

In terms of the first interpretation mentioned above, it is relevant to study the behavior as $k \to \infty$ of the limiting distributions obtained in Section 4.3.

Theorem 4.8. Under the conditions of Theorem 4.1, let $\beta^*_j$ be as in (8.7) and $F = (f_{ij})$ as in (4.24). Then, for fixed j,

(4.34)  $\lim_{k\to\infty} \beta^*_j = (-\alpha)^j$ ,

and for fixed i and j

(4.35)  $\lim_{k\to\infty} f_{ij} = \begin{cases} 1+\alpha^2 , & i = j , \\ \alpha , & |i-j| = 1 , \\ 0 , & \text{otherwise} . \end{cases}$

Proof. Expressions (8.7), (4.28) and (4.29) make the proof immediate, because $|\alpha| < 1$. Q.E.D.

These results can be interpreted as follows: if T is large enough, and k large enough, then the first stage of the estimation procedure (approximately) estimates the $(-\alpha)^j$ as coefficients of (4.1) (see also the discussion in Section 4.1), and the covariance matrix of these estimators is $\Sigma^{-1}$. If $\sigma^2 \ne 1$, then the covariance matrix is $\sigma^2 \Sigma^{-1}$. Since $\Sigma^{-1} = \operatorname{plim}_{T\to\infty} M_T^{-1}$ for fixed k, this shows that (approximately) the first stage works as a standard regression problem with stochastic regressors.

These results were mentioned and used by Durbin [(1959), page 307].

Theorem 4.9. Under the conditions of Theorem 4.1, let $\alpha^*$ and v be as in Theorem 4.6. Then

(4.36)  $\lim_{k\to\infty} \alpha^* = \alpha$ ,

(4.37)  $\lim_{k\to\infty} v = 1-\alpha^2$ .

Proof. The forms (4.15) and (4.21) make the proof immediate. Q.E.D.

The results of Theorems 4.8 and 4.9 can be arrived at in a direct way, by redoing the proof of Theorem 4.6 and discarding readily the terms that are negligible for k large. Durbin [(1959), Section 4] gives a different argument to this effect.
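
The limits (4.34) and (4.36) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the fixed-k probability limits to be $\beta^*_j = (-\alpha)^j (1-\alpha^{2k+2-2j})/(1-\alpha^{2k+2})$, the form quoted in Section 6.1(c), and it takes $\alpha^*$ to be minus the ratio of sums in (4.30), the sign being chosen so that $\alpha^* \to \alpha$. The function names `beta_star` and `alpha_star` are hypothetical.

```python
def beta_star(alpha: float, k: int, j: int) -> float:
    """Assumed probability limit of the j-th coefficient of the fitted AR(k)
    when the data follow a first-order moving average with parameter alpha."""
    return (-alpha) ** j * (1 - alpha ** (2 * k + 2 - 2 * j)) / (1 - alpha ** (2 * k + 2))


def alpha_star(alpha: float, k: int) -> float:
    """Probability limit of the second-stage estimator for fixed k:
    minus the ratio of sum(b_i * b_{i+1}) to sum(b_i ** 2), with b_0 = 1."""
    b = [beta_star(alpha, k, j) for j in range(k + 1)]
    num = sum(b[i] * b[i + 1] for i in range(k))      # i = 0, ..., k-1
    den = sum(x * x for x in b)                       # i = 0, ..., k
    return -num / den


# How fast the asymptotic bias alpha_star - alpha dies out as k grows.
for k in (1, 2, 5, 10, 20):
    print(k, alpha_star(0.5, k) - 0.5)
```

For moderate $|\alpha|$ the difference $\alpha^* - \alpha$ becomes negligible already for small k, while for $|\alpha|$ close to 1 a much larger k is needed.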
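The ith component of $m_T + M_T\beta^*$ is given by (4.23). Using $\sim$ to mean "asymptotically equivalent to" (as $k \to \infty$), we have that

(4.38)  $\sum_{h=0}^{k} \beta^*_h\, y_{t-h} \sim \sum_{h=0}^{k} (-\alpha)^h \left[ \varepsilon_{t-h} - (-\alpha)\, \varepsilon_{t-h-1} \right] = \varepsilon_t - (-\alpha)^{k+1} \varepsilon_{t-(k+1)} \sim \varepsilon_t$ ,

so that (4.23) is asymptotically equivalent to $(1/T) \sum_{t=k+1}^{T} y_{t-i}\, \varepsilon_t$. Instead of (8.16) we evaluate directly, using $y_t = \varepsilon_t + \alpha\, \varepsilon_{t-1}$,

(4.39)  $\lim_{T\to\infty} \dfrac{1}{T} \sum_{s,t=k+1}^{T} \mathcal{E}\,(y_{s-i}\, \varepsilon_s\, y_{t-j}\, \varepsilon_t) = \begin{cases} 1+\alpha^2 , & i = j , \\ \alpha , & |i-j| = 1 , \\ 0 , & |i-j| > 1 , \end{cases}$

which is the same result as (4.35). Hence $\sqrt{T}\,(\hat\beta_T - \beta^*)$ converges in distribution to a normal with parameters given approximately by 0 and $\Sigma^{-1}$, and in (4.31)

(4.40)  $v \sim \sum_{i,j=1}^{k} \sigma^{ij}\, \dfrac{\partial\alpha^*}{\partial\beta^*_i}\, \dfrac{\partial\alpha^*}{\partial\beta^*_j}$ .

Now

(4.41)  $\dfrac{\partial\alpha^*}{\partial\beta^*_i} \sim -(1-\alpha^2)^2 (-\alpha)^{i-1}$ ,  $\sigma^{ij} \sim \dfrac{(-\alpha)^{j-i} (1-\alpha^{2i})}{1-\alpha^2}$ ,  $i \le j$ ,

and hence

(4.42)
$v \sim \dfrac{(1-\alpha^2)^4}{\alpha^2}\, \dfrac{1}{1-\alpha^2} \sum_{i=1}^{k} \left[ \sum_{j=1}^{i} (-\alpha)^{i+j} (-\alpha)^{i-j} (1-\alpha^{2j}) + \sum_{j=i+1}^{k} (-\alpha)^{i+j} (-\alpha)^{j-i} (1-\alpha^{2i}) \right]$
$\quad = \dfrac{(1-\alpha^2)^3}{\alpha^2} \sum_{i=1}^{k} \left[ \alpha^{2i} \left( i - \alpha^2\, \dfrac{1-\alpha^{2i}}{1-\alpha^2} \right) + (1-\alpha^{2i})\, \alpha^{2i+2}\, \dfrac{1-\alpha^{2k-2i}}{1-\alpha^2} \right]$
$\quad \sim \dfrac{(1-\alpha^2)^3}{\alpha^2} \sum_{i=1}^{k} i\, \alpha^{2i} \sim \dfrac{(1-\alpha^2)^3}{\alpha^2} \sum_{i=1}^{\infty} i\, \alpha^{2i}$
$\quad = \dfrac{(1-\alpha^2)^3\, \alpha^2}{\alpha^2 (1-\alpha^2)^2} = 1-\alpha^2$ .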
5. ESTIMATION BASED ON THE FINITE AUTOREGRESSIVE APPROXIMATION. A MODIFIED VERSION OF THE ESTIMATOR.

5.1  Introduction. 

The asymptotic theory developed in Chapter 4 leads us to consider the two-stage estimator of $\alpha$ proposed by Durbin as one that is satisfactory from the large-sample theory viewpoint. However, nothing has been said about its small-sample properties.

In his original paper Durbin (1959) exhibited as illustration a group of 10 simulation runs with T = 100, where the observations were generated by model (1.7) with $\alpha = 0.5$. The resulting estimates showed a good agreement with the asymptotic variance $(1-\alpha^2)/T$ but their average differed rather seriously from 0.5. In his later paper Walker (1961) tried to account for part of the small-sample bias, but his correction is complicated and not completely effective from a practical point of view. Hence the question of small-sample bias seems an open one.

One possible way to improve the finite-sample performance is to use more fully the structure of the underlying moving-average model. This can be done in a way that also makes the computations simpler.

The idea is due to Anderson (1971b) and consists in replacing the first-stage equation (4.6) by

(5.1)  $C_T\, \tilde\beta_T = -c_T$ ,

where
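(5.2)  $C_T = \begin{pmatrix} c_{0T} & c_{1T} & 0 & \cdots & 0 \\ c_{1T} & c_{0T} & c_{1T} & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & c_{0T} \end{pmatrix}$ ,  $c_T = (c_{1T}, c_{2T}, \ldots, c_{kT})'$ ,

and as in Chapter 2,

(5.3)  $c_{jT} = \dfrac{1}{T} \sum_{t=1}^{T-j} y_t\, y_{t+j}$ .

Note that for each fixed k,

(5.4)  $\operatorname{plim}_{T\to\infty} C_T = \Sigma$ ,  $\operatorname{plim}_{T\to\infty} c_T = \sigma$ .

The basic idea is to replace $c_{jT}$ by 0 for $j > 1$ in the matrix of sample covariances, since in fact $\sigma_y(j) = 0$ if $j > 1$ (see (1.12)); then both $M_T$ and $C_T$ estimate $\Sigma$ consistently.

If we now write

(5.5)  $R_T = \dfrac{1}{c_{0T}}\, C_T = \begin{pmatrix} 1 & r_{1T} & 0 & \cdots & 0 \\ r_{1T} & 1 & r_{1T} & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}$ ,

then $R_T$ is the matrix of those sample autocorrelations that do not estimate 0, and we have that

(5.6)  $\tilde\beta_T = -C_T^{-1} c_T = -R_T^{-1} r_T$ ,

where $r_T = (r_{1T}, \ldots, r_{kT})'$ as in Section 2.1.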
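The components $w^{ij}_T$ of $R_T^{-1}$ are given by

(5.7)  $w^{ij}_T = \dfrac{(1 - x_{1T}^{2k-2j+2})\,(x_{1T}^{j+i+1} - x_{1T}^{j-i+1})}{r_{1T}\,(1 - x_{1T}^{2})\,(1 - x_{1T}^{2k+2})}$ ,  $i \le j$ ,

where

(5.8)  $x_{1T} = \dfrac{-1 + \sqrt{1 - 4 r_{1T}^2}}{2 r_{1T}}$ ;

see e.g. Mentz [(1972), Chapter 3]. Hence $\tilde\beta_T$ has components

(5.9)  $\tilde\beta_{iT} = -\dfrac{1}{c_{0T}} \sum_{j=1}^{k} w^{ij}_T\, c_{jT} = -\sum_{j=1}^{k} w^{ij}_T\, r_{jT}$ ,  $i = 1, \ldots, k$ ,

and the final estimator of $\alpha$ is now

(5.10)  $\tilde\alpha_T = -\dfrac{\sum_{i=0}^{k-1} \tilde\beta_{iT}\, \tilde\beta_{i+1,T}}{\sum_{i=0}^{k} \tilde\beta_{iT}^2}$ ,  $\tilde\beta_{0T} = 1$ ,

with the $\tilde\beta_{iT}$, $i \ge 1$, expressed through the $w^{ij}_T$ and $r_{jT}$ as in (5.9).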

This estimator is easier to calculate than that of Chapter 4, because instead of having to solve the system $M_T \hat\beta_T = -m_T$ in the first stage, we have the explicit form (5.10); this of course reflects the fact that we know explicitly the components of $C_T^{-1}$. The large-sample properties of $\tilde\alpha_T$ will be investigated mathematically below. The small-sample performance can be studied through simulations, but we will not include them in this work.
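
As a rough illustration of the two stages, the following sketch computes the modified estimator from a simulated series. It is not the computation used in this work: instead of the closed-form inverse (5.7) it solves the tridiagonal system $R_T \tilde\beta_T = -r_T$ directly by forward elimination and back substitution, and it assumes the second stage is minus the ratio of sums with $\tilde\beta_{0T} = 1$. All function names are hypothetical, and the solver presumes $|r_{1T}| < 1/2$.

```python
import random


def simulate_ma1(alpha, T, seed=0):
    """y_t = e_t + alpha * e_{t-1} with independent standard normal e_t."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(T + 1)]
    return [eps[t] + alpha * eps[t - 1] for t in range(1, T + 1)]


def sample_autocorrelations(y, k):
    """r_0, ..., r_k with c_j = (1/T) * sum_{t=1}^{T-j} y_t y_{t+j}, as in (5.3)."""
    T = len(y)
    c = [sum(y[t] * y[t + j] for t in range(T - j)) / T for j in range(k + 1)]
    return [cj / c[0] for cj in c]


def solve_tridiagonal(diag, off, rhs):
    """Solve A x = rhs, A having `diag` on the main diagonal and `off` on the
    two adjacent diagonals (forward elimination, then back substitution)."""
    n = len(rhs)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = off / diag, rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * cp[i - 1]
        cp[i] = off / denom
        dp[i] = (rhs[i] - off * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def modified_estimator(y, k):
    r = sample_autocorrelations(y, k)
    # First stage: R beta = -r, R tridiagonal with 1 and r_1.
    beta = solve_tridiagonal(1.0, r[1], [-rj for rj in r[1:]])
    b = [1.0] + beta
    # Second stage: minus the ratio of sums, numerator to k-1, denominator to k.
    num = sum(b[i] * b[i + 1] for i in range(k))
    den = sum(v * v for v in b)
    return -num / den


y = simulate_ma1(alpha=0.5, T=5000, seed=123)
print(modified_estimator(y, k=10))
```

The direct solve costs O(k) operations, so in practice little is lost by not using the explicit inverse; the explicit form matters mainly for the asymptotic analysis.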
As was noted in Section 4.1, Durbin [(1959), p. 312] suggested as an alternative to (4.6), and hence to (5.6) above, to estimate the $\beta_j$'s of the approximating autoregression by $-\check R_T^{-1} r_T$, where $\check R_T$ has components $\check r_{ijT} = r_{|i-j|,T}$ for $1 \le i, j \le k$. Clearly the proposal studied in this chapter corresponds to letting $\check r_{ijT} = 0$ for $|i-j| > 1$.

From the proof of Theorem 4.1 and the fact that (5.4) holds, we see that

(5.11)  $\operatorname{plim}_{T\to\infty} \tilde\alpha_T = \alpha^*$ ,

where $\alpha^*$ is given by (4.15) or (4.18); it would be given by (4.9) or (4.16) if the sum on i in the denominator of (5.10) reached k-1 instead of k. Hence $\tilde\alpha_T$ is also an inconsistent estimator of $\alpha$.

To find the asymptotic distribution we note that the same steps of the proof of Theorem 4.6 can be used. In fact

(5.12)  $\sqrt{T}\,(\tilde\beta_T - \beta^*)$

and

(5.13)  $-\sqrt{T}\, \Sigma^{-1} (c_T + C_T \beta^*)$

have the same limiting distribution as $T \to \infty$, by the same arguments used in going from (8.10) to (8.13). The vector $c_T + C_T\beta^*$ has components

(5.14)  $c_{1T} + \beta^*_2\, c_{1T} + \beta^*_1\, c_{0T}$ ,  $c_{kT} + \beta^*_{k-1}\, c_{1T} + \beta^*_k\, c_{0T}$ ,

and

(5.15)  $c_{iT} + (\beta^*_{i-1} + \beta^*_{i+1})\, c_{1T} + \beta^*_i\, c_{0T}$ ,  $i = 2, 3, \ldots, k-1$ ,

which are of the form

(5.16)  $\sum_{h=0}^{k} \beta^*_{ih}\, \dfrac{1}{T} \sum_{t=1}^{T-h} y_t\, y_{t+h}$ .

These random variables have the same structure as those of equation (2.43), considered in Section 7.3.3; by the argument presented there, it follows that for fixed i the random variables $\sum_{h=0}^{k} \beta^*_{ih}\, y_t\, y_{t+h}$ are finitely dependent of order k+1, which is now a fixed number. By the Central Limit Theorem for finitely dependent random variables $\sqrt{T}\,(c_T + C_T\beta^*)$ has a limiting normal distribution, and so does $-\sqrt{T}\, \Sigma^{-1}(c_T + C_T\beta^*)$.

We have to find the variances and covariances of the limiting distributions. Let

(5.17)  $u = (u_1, u_2, \ldots, u_k)' = \sqrt{T}\,(c_T + C_T\beta^*)$ ;

then $\mathcal{E} u_i = 0$ and we need $\mathcal{E} u_i u_j = \operatorname{Cov}(u_i, u_j)$ for $i, j = 1, 2, \ldots, k$.

To avoid lengthy algebraic details as those of Chapter 4, we shall only consider the evaluation of the variances and covariances of the limiting distributions as $T \to \infty$, omitting factors and terms like $\alpha^k$, $k\alpha^k$, etc. that tend to 0 as $k \to \infty$, proceeding as we did at the end of Section 4.4. In particular we take (5.15) as including i = k, because the addition of $\beta^*_{k+1}\, c_{1T} \sim (-\alpha)^{k+1} c_{1T}$ to $u_k$ will not affect the necessary values. Hence we need the limits as $T \to \infty$ of $\mathcal{E} u_1^2$, $\mathcal{E} u_1 u_j$ and $\mathcal{E} u_i u_j$ for $i, j = 2, 3, \ldots, k$, where $u_1$ is defined in (5.14) and $u_i$ in (5.15). For $i, j \ge 2$ we have that
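(5.18)
$\mathcal{E} u_i u_j = T\, \mathcal{E} \{\, c_{iT} c_{jT} + (\beta^*_{j-1} + \beta^*_{j+1})\, c_{iT} c_{1T} + (\beta^*_{i-1} + \beta^*_{i+1})\, c_{jT} c_{1T} + \beta^*_j\, c_{iT} c_{0T} + \beta^*_i\, c_{jT} c_{0T}$
$\qquad + (\beta^*_{i-1} + \beta^*_{i+1})(\beta^*_{j-1} + \beta^*_{j+1})\, c_{1T}^2 + [\beta^*_j (\beta^*_{i-1} + \beta^*_{i+1}) + \beta^*_i (\beta^*_{j-1} + \beta^*_{j+1})]\, c_{0T} c_{1T} + \beta^*_i \beta^*_j\, c_{0T}^2 \,\}$ .

Since $\mathcal{E} u_i = 0$ we can evaluate

(5.19)
$T\, (\mathcal{E} c_{iT} c_{jT} - \mathcal{E} c_{iT}\, \mathcal{E} c_{jT}) \sim \dfrac{1}{T} \sum_{s=1}^{T-i} \sum_{t=1}^{T-j} \left[ \mathcal{E}\,(y_s\, y_{s+i}\, y_t\, y_{t+j}) - \sigma_y(i)\, \sigma_y(j) \right]$
$\quad = \dfrac{1}{T} \sum_{s=1}^{T-i} \sum_{t=1}^{T-j} \left[ \sigma_y(i)\sigma_y(j) + \sigma_y(t-s)\, \sigma_y(t-s+j-i) + \sigma_y(t-s+j)\, \sigma_y(t-s-i) - \sigma_y(i)\sigma_y(j) \right]$
$\quad = \dfrac{1}{T} \sum_{s=1}^{T-i} \sum_{t=1}^{T-j} \left[ \sigma_y(t-s)\, \sigma_y(t-s+j-i) + \sigma_y(t-s+j)\, \sigma_y(t-s-i) \right]$ .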

Since the covariances vanish for lags exceeding one in absolute value, the first summand will contribute only when t-s = -1, 0 or 1; in the second summand t-s+j and t-s-i must also be one of these three values; this determines contributions only for i, j = 0, 1, 2, and in turn limits t-s to be -1, 0 or 1. Hence (5.19) tends as $T \to \infty$ to the sum of the contributions listed in expression (7.27). Then
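(5.20)
$\lim_{T\to\infty} T\, (\mathcal{E} c_{iT} c_{jT} - \mathcal{E} c_{iT}\, \mathcal{E} c_{jT})$
$\quad = 2(1 + 4\alpha^2 + \alpha^4)$ ,  $i = j = 0$ ,
$\quad = 1 + 5\alpha^2 + \alpha^4$ ,  $i = j = 1$ ,
$\quad = 1 + 4\alpha^2 + \alpha^4$ ,  $i = j = 2, \ldots, k$ ,
$\quad = 4\alpha(1+\alpha^2)$ ,  $i = 0, j = 1$ or $i = 1, j = 0$ ,
$\quad = 2\alpha^2$ ,  $i = 0, j = 2$ or $i = 2, j = 0$ ,
$\quad = 2\alpha(1+\alpha^2)$ ,  $|i-j| = 1$, $(i,j) \ne (0,1), (1,0)$ ,
$\quad = \alpha^2$ ,  $|i-j| = 2$, $(i,j) \ne (0,2), (2,0)$ ,
$\quad = 0$ ,  $|i-j| > 2$ .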

These values can be checked with an expression in terms of the spectral density function defined in (1.28), because (5.20) equals $4\pi \int_{-\pi}^{\pi} \cos(\nu i) \cos(\nu j)\, f_y^2(\nu)\, d\nu$, and for the case of q = 1, $f_y(\nu) = (\sigma^2/2\pi)(1 + \alpha^2 + 2\alpha \cos\nu)$. See, for example, Anderson [(1971a), Sections 7.5.2 and 8.4.2].
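
The spectral check just mentioned is easy to carry out numerically. The sketch below is an illustration only: it assumes $\sigma^2 = 1$, evaluates the integral by the midpoint rule (which is exact up to rounding for trigonometric polynomials of low degree), and compares it with the entries of (5.20) as we read them, namely $2(1+4\alpha^2+\alpha^4)$, $1+5\alpha^2+\alpha^4$, $1+4\alpha^2+\alpha^4$ on the diagonal, $4\alpha(1+\alpha^2)$ and $2\alpha(1+\alpha^2)$ at lag difference one, $2\alpha^2$ and $\alpha^2$ at lag difference two. The function names are hypothetical.

```python
import math


def spectral_value(i, j, alpha, sigma2=1.0, n=2000):
    """4*pi * integral over (-pi, pi) of cos(v*i) * cos(v*j) * f(v)**2,
    with f(v) = sigma2 / (2*pi) * (1 + alpha**2 + 2*alpha*cos(v)),
    approximated by the midpoint rule on n equal subintervals."""
    h = 2.0 * math.pi / n
    total = 0.0
    for m in range(n):
        v = -math.pi + (m + 0.5) * h
        f = sigma2 / (2.0 * math.pi) * (1.0 + alpha ** 2 + 2.0 * alpha * math.cos(v))
        total += math.cos(v * i) * math.cos(v * j) * f * f
    return 4.0 * math.pi * total * h


def table_value(i, j, alpha):
    """Entries of (5.20) as read from the text, for sigma2 = 1."""
    a2 = alpha * alpha
    lo, hi = min(i, j), max(i, j)
    if lo == hi:
        if lo == 0:
            return 2.0 * (1.0 + 4.0 * a2 + a2 * a2)
        if lo == 1:
            return 1.0 + 5.0 * a2 + a2 * a2
        return 1.0 + 4.0 * a2 + a2 * a2
    if hi - lo == 1:
        return (4.0 if lo == 0 else 2.0) * alpha * (1.0 + a2)
    if hi - lo == 2:
        return (2.0 if lo == 0 else 1.0) * a2
    return 0.0


alpha = 0.5
worst = max(abs(spectral_value(i, j, alpha) - table_value(i, j, alpha))
            for i in range(6) for j in range(6))
print(worst)
```

The largest discrepancy over the lags tried should be at the level of rounding error if the two expressions agree.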

Substituting in (5.18) the values derived in (5.20), we can evaluate $\lim_{T\to\infty} \mathcal{E} u_i u_j = a_{ij}$, say.

Now the covariance matrix of the limiting normal distribution of (5.13) is given for large k approximately by

(5.21)  $\Sigma^{-1} (a_{ij})\, \Sigma^{-1}$ ,

whose components are

(5.22)  $\sum_{s,t=1}^{k} \sigma^{is}\, a_{st}\, \sigma^{tj}$ ,  $i, j = 1, 2, \ldots, k$ .

Let $\tilde v$ be the variance of the limiting normal distribution as $T \to \infty$ of $\sqrt{T}\,(\tilde\alpha_T - \alpha^*)$, where $\tilde\alpha_T$ is defined in (5.10) and $\alpha^*$ in (4.15), and we operate in the manner specified earlier in this section. As in (4.31) $\tilde v$ is given by

(5.23)  $\tilde v = \sum_{i,j=1}^{k} \tilde h_{ij}\, \dfrac{\partial\alpha^*}{\partial\beta^*_i}\, \dfrac{\partial\alpha^*}{\partial\beta^*_j}$ ,

where $\tilde h_{ij}$ is given approximately by (5.22).

We then have that as $k \to \infty$, $\tilde v$ approaches


(5.24) 


(l=o:)  + a 


6 16+90^+70:^  , 12 


( i=qt  y 


The  mathematical  details  are  given  in  Chapter 

We summarize the main results obtained so far as follows.

Lemma 5.1. Under the conditions of Theorem 4.1 the covariances of the limiting normal distribution of $\sqrt{T}\,(c_{0T} - \sigma_y(0))$, $\sqrt{T}\,(c_{1T} - \sigma_y(1))$, $\sqrt{T}\, c_{2T}$, ..., $\sqrt{T}\, c_{kT}$ are given by (5.20).

Proof. For a general linear process the asymptotic normality is proved, for example, in Anderson [(1971a), Section 8.4.2]. This result merely specializes that to the moving average model. Q.E.D.

Theorem 5.2. Under the conditions of Theorem 4.1 let $\tilde\beta_T$ be defined in (5.6). Then $\operatorname{plim}_{T\to\infty} \tilde\beta_T = \beta^*$. Further $\sqrt{T}\,(\tilde\beta_T - \beta^*)$ has a limiting normal distribution with parameters 0 and $\tilde H$, and for large k, $\tilde H$ is given approximately by (5.21).

Theorem 5.3. Under the conditions of Theorem 4.1 let $\tilde\alpha_T$ be defined in (5.10). Then $\operatorname{plim}_{T\to\infty} \tilde\alpha_T = \alpha^*$. Further $\sqrt{T}\,(\tilde\alpha_T - \alpha^*)$ has a limiting normal distribution with parameters 0 and $\tilde v$, and $\lim_{k\to\infty} \tilde v$ is given in (5.24).

The actual determination of the exact values of $\tilde H$ and $\tilde v$ in the previous two theorems can be done as in Chapter 4, but we omit those details here.

5.3 Other Variants of the Proposal.

After the work of the previous two sections was completed, the publication of a paper by McClave (1973) directed our interest to some variants of the estimation procedure described in Section 5.1. These variants will be analyzed briefly here.

McClave (1973) studies empirically three modifications of Durbin's proposal described in Chapter 4, with the desire to control the small-sample bias. In our notation they consist of the following:

(i) To let the sums in the numerator and denominator of (4.7) range only over $0 \le i \le n_1 - 1$, for some integer $n_1$ ($n_1 < k$) to be chosen simultaneously with k.

(ii) To replace $(1/T) \sum_{t=k+1}^{T} y_{t-i}\, y_{t-i+2}$ by 0 in $M_T$ and $m_T$ defined in (4.5).

(iii) To replace $(1/T) \sum_{t=k+1}^{T} y_{t-i}\, y_{t-i+h}$ by 0 in $M_T$ and $m_T$ for $h = n_2+1, n_2+2, \ldots$, where $n_2$ is an integer ($2 < n_2 < k$) to be chosen simultaneously with k.

In these terms the proposal defined in Section 5.1 corresponds to case (iii) with $n_2 = 1$, except that the sample quantities are set equal to their probability limits in $M_T$ but not in $m_T$. (The difference between the sample quantities in $M_T$ and in $C_T$ is minor, as was noted above.)
Unfortunately for us McClave does not publish numerical results for $n_2 = 1$.

The paper under study presents results for alternative (ii) when simultaneously several choices of $n_1$ as in (i) are made, and for (iii) when remedy (i) is also used, and $n_1 = n_2$. In the first such case the resulting procedure is effective in decreasing the bias (for T = 100, $\alpha = 0.5$, $5 < k < 10$, $4 < n_1 < 6$), but "the corresponding variance increase is about fourfold" (p. 601). For the second alternative (for T = 100, $\alpha$ = 0.3, 0.5 and 0.8, $5 < k < 40$, $1 < n_1 = n_2 < 5$), the bias is also decreased but as $n_1$ becomes small (i.e., more sample quantities are set equal to zero), "the increase in variance...becomes more significant as $|\alpha|$ increases" (p. 603).

It is clear that McClave's proposals could be easily studied as in Sections 4.4 and 5.2, and also as in Sections 4.2 and 4.3, to determine the behavior as $T \to \infty$. From a practical point of view proposals (i) and (iii) imply the choice of new quantities ($n_1$, $n_2$, or both) to be chosen together with k, and clearly the resulting procedures are less attractive for practical use.

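We now consider the case of changing the procedure of Section 5.1 by replacing $c_{jT}$ by 0 for $j > 1$, also in $c_T$ defined in (5.2). Let $\bar c_T = (c_{1T}, 0, \ldots, 0)'$, $\bar\beta_T = -C_T^{-1} \bar c_T$, and $\bar\alpha_T$ defined as in (5.10) with $r_{jT}$ replaced by 0 for $j > 1$. The same approach of Section 5.2 can be used. In particular, $\operatorname{plim}_{T\to\infty} \bar c_T = \sigma$ as before.

Let $\bar u_i$ be the i-th component of $\bar u = \sqrt{T}\,(\bar c_T + C_T\beta^*)$. Then

(5.25)  $\bar u_i = \sqrt{T}\, [\, (\beta^*_{i-1} + \beta^*_{i+1})\, c_{1T} + \beta^*_i\, c_{0T} \,]$ ,  $i = 1, \ldots, k$ ,

using again that $\beta^*_0 = 1$, $\beta^*_{k+1} \sim 0$. Hence

(5.26)  $\lim_{T\to\infty} \operatorname{Cov}(\bar u_i, \bar u_j) = \lim_{T\to\infty} \mathcal{E}\, \bar u_i \bar u_j = (-\alpha)^{i+j-2}\, [\, \alpha^2 (1+\alpha^2)^2 + (1+\alpha^4)^2 \,]$ ,  $i, j = 1, \ldots, k$ ,

which is the component $a_{ij1}$ of $a_{ij}$ introduced in (9.2). Then the variance of the limiting normal distribution (as $T \to \infty$), calculated as in Section 5.2 for k large, is

(5.27)
$\bar v \sim \sum_{i,j=1}^{k} \dfrac{\partial\alpha^*}{\partial\beta^*_i}\, \dfrac{\partial\alpha^*}{\partial\beta^*_j} \sum_{s,t=1}^{k} \sigma^{is}\, a_{st1}\, \sigma^{tj}$
$\quad = \dfrac{[\alpha^2(1+\alpha^2)^2 + (1+\alpha^4)^2]\,(1-\alpha^2)^4}{\alpha^2\, \alpha^2} \left[ \sum_{i=1}^{k} (-\alpha)^i \sum_{s=1}^{k} (-\alpha)^s\, \sigma^{is} \right]^2$
$\quad \sim \dfrac{[\alpha^2(1+\alpha^2)^2 + (1+\alpha^4)^2]\,(1-\alpha^2)^4}{\alpha^4}\, \dfrac{\alpha^4}{(1-\alpha^2)^6} = \dfrac{\alpha^2(1+\alpha^2)^2 + (1+\alpha^4)^2}{(1-\alpha^2)^2}$
$\quad = \dfrac{1 + \alpha^2 + 4\alpha^4 + \alpha^6 + \alpha^8}{(1-\alpha^2)^2}$ .

This is the asymptotic variance of the "analog" or moment estimator defined in (1.34) [cf. Whittle (1953), p. 432]. The connection can be checked easily because for j = 1 (5.7) becomes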


(5.28) 


1-x 


w 


2k 

IT 


ilT 


/,  2 w 2k+2\ 

■'■’-I  rn(  m)  ( 4”X-.  „ ) 


/ 2+i  2-iv 
U1T  ‘hi  > ' 


1 — 1, 2, # • • , k 


1 IT  IT 


IT 
and letting $r_{jT} = 0$ for $j > 1$ in (5.10) we have that

a = 
T 


k=l 

I 

i=0 


W T W T” 

ilT  IT  i+l,lT  IT 


T"  ('w  r Y 
1=0  1 ilT  IT' 


(5.29) 


k-1 


i;  <4*  - 


+i  1-i 


1=0 


IT  ' IT 

I (41  " 41)2 


xit  y 


i=0 

k-1 


-2i-2- 


x 


^ ^X1T  “ ^ 1+X1T  ^ + X1T  ^ 


IT  k 

I (x. 

1=0 


.21 

IT 


2 + x”2i) 
^ X1T  ; 


which is approximately equal to $-x_{1T} = (2 r_{1T})^{-1} \left( 1 - \sqrt{1 - 4 r_{1T}^2} \right)$, for large k.
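
The limiting form $-x_{1T}$ is the root, of absolute value at most one, of $r_{1T} = a/(1+a^2)$, which is what one would expect of a moment estimator. A minimal sketch of that computation follows; the function name `moment_estimator` and the handling of $|r_{1T}| > 1/2$ (where no real root exists) are our own choices, not part of the text.

```python
import math


def moment_estimator(r1: float) -> float:
    """Root with |a| <= 1 of r1 = a / (1 + a**2), i.e.
    a = (1 - sqrt(1 - 4*r1**2)) / (2*r1)."""
    if r1 == 0.0:
        return 0.0                      # limiting value as r1 -> 0
    if abs(r1) > 0.5:
        raise ValueError("no real root: |r1| exceeds 1/2")
    return (1.0 - math.sqrt(1.0 - 4.0 * r1 * r1)) / (2.0 * r1)


# With alpha = 0.5 the first autocorrelation is 0.5 / 1.25 = 0.4.
print(moment_estimator(0.4))
```

In small samples $|r_{1T}|$ can exceed 1/2 with appreciable probability when $|\alpha|$ is near 1, so some rule for that case is needed in practice.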

The values of $\tilde v$ and $\bar v$ are compared in Table 5.1 with $1-\alpha^2$ for several values of $\alpha$.

Table 5.1

Values of $\tilde v$, $\bar v$ and $1-\alpha^2$ for different $\alpha$

  $\alpha$     $\tilde v$        $\bar v$         $1-\alpha^2$
   .1          .990016        1.030916       .99
   .2          .961088        1.135488       .96
   .3          .923368        1.356351       .91
   .4          .923?20        1.795849       .84
   .5         1.118489        2.701388       .75
   .6         2.028235        4.740849       .64
   .7         5.962541       10.094951       .51
   .8        30.477959       28.613550       .36
   .9       362.098390      149.482220       .19

(? marks a digit that is illegible in the source.)
Hence for a wide range of values of $\alpha$ setting some estimators equal to 0 (their probability limit as $T \to \infty$) in $c_T$ as well as in $C_T$ results in an increase in the asymptotic variances.
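
The $\bar v$ column of Table 5.1 can be reproduced from the closed form at the end of (5.27), read here as $(1+\alpha^2+4\alpha^4+\alpha^6+\alpha^8)/(1-\alpha^2)^2$. The sketch below tabulates it next to the bound $1-\alpha^2$; the function name `v_bar` is hypothetical, and no attempt is made to reproduce the $\tilde v$ column, whose formula (5.24) is not used here.

```python
def v_bar(alpha: float) -> float:
    """Asymptotic variance of the moment estimator, last line of (5.27)."""
    a2 = alpha * alpha
    return (1.0 + a2 + 4.0 * a2 ** 2 + a2 ** 3 + a2 ** 4) / (1.0 - a2) ** 2


for tenth in range(1, 10):
    alpha = tenth / 10.0
    bound = 1.0 - alpha * alpha
    # Columns: alpha, v_bar, 1 - alpha^2, and the ratio bound / v_bar.
    print(f"{alpha:.1f}  {v_bar(alpha):12.6f}  {bound:.2f}  {bound / v_bar(alpha):.4f}")
```

The last column is the asymptotic efficiency of the moment estimator relative to $1-\alpha^2$; it falls off quickly as $|\alpha|$ approaches 1.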

It is apparent that the two alternatives are highly inefficient for values of $|\alpha|$ close to 1. Since in McClave's paper it is shown that his proposals were in general effective as bias-reducing devices, it seems safe to conjecture that $\tilde\alpha_T$ and $\bar\alpha_T$ considered in this chapter should also be considered as competitors in reducing the small-sample bias of the proposal in Chapter 4. However, as is often the case in time series estimation problems, there is a severe trade-off between bias and variance.
6. GENERAL COMMENTS


6.1  Comments  About 


In the Introduction and Summary, and also in Chapter 1, we presented some comments about the basic proposals considered in Chapters 2 and 4. At the beginning or end of the preceding four chapters we commented briefly about the corresponding estimation procedures, and the properties we were interested in proving. We did not discuss in any detail the contents of the papers by Durbin (1959) and Walker (1961), nor shall we do that here.

In this section we want to insert some additional comments stemming from both our work and consideration of the two papers referred to above. The comments will be given jointly for the proposals considered in Chapters 2 and 3, and 4 and 5, since it will become apparent that there exist ample similarities among them. We shall refer only to the case of q = 1, the first-order moving average model. It is hoped that some of these comments may be useful for further studies of the estimation problems considered here.

a)  Interpretation  of  the  estimators  as  linear  combinations  of  sample 


py(1) 

(6.1) 


iuantlties.  From  Section  4.1  we  know  that  Walker's 
is  a linear  combination  of  sample  autocorrelations 


A 


k-1 

Jo  Vh  Vut 


k 

r1T  + I EW,(j"l) 

3=2 


estimator  of 
, since  (2.7),  is 


On  the  other  hand,  we  can  write  Durbin's  estimator  of  a given  in  (4.7) 
as 
a linear combination of the first k sample autoregressive coefficients,
where 


(6.3) 


ApQ)  = 


k-l 

I 


j=o 


9 


j — 0-j  1,?  • • • $ k-l  $ 


and  b^  = 1.  Note  however  that  in  general  i^(o)  4 1* 

The  m (j)  and  H^(j)  are  also  random  variables;  functions  of  the  y^’ 

b)  Behavior  of  the  sums  of  the  coefficients  of  the  linear  combinations. 

Having  noted  that  the  estimators  are  linear  combinations  of  sample 

statistics;  it  pays  to  consider  the  values  of  the  sums  of  the  coefficients. 

For  large  T and  k;  we  know  that  the  n^(j)  in  (6.1)  are  approximated 

by  the  m ( j ) introduced  in  (2.37);  which  in  turn  converge  to  (7.25). 

±;1 

Hence  for  large  T and  k; 

k-l  k-l 

(6.4)  I nu(j)  ~ I (-a)"1 

<5=0  j=0 

Similarly;  for  large  T and  k;  the  b in  (6.2)  and  (6.3)  are  approxi- 

o 1 

mated  by  (8.7);  and  that  in  turn  by  (-a)J.  Hence 


1 + • 1"C^ 
1 + 1 0 

1+Q! 


1+a 
( 1+a)' 


(6.5) 


k-l 

X iT(i)  = k_i 

0=0 


1-a  . 


I * 


j=0 


<5T 


For positive $\alpha$, (6.4) and (6.5) are smaller than 1; and for negative $\alpha$ they are larger than 1.

We showed that the coefficients are the appropriate ones that lead to the desired large-sample results. However, it might be possible to change them slightly to correct the small-sample downward biases for $\alpha > 0$, say, without significantly affecting the small- and large-sample variances. These ideas should of course be studied mathematically as we did in Chapter 5, and also empirically through Monte Carlo trials.

c) Asymptotic behavior of first sample autocorrelation and autoregressive coefficients. We discussed in Section 2.1 that $r_{1T}$ estimates $\rho_y(1)$ consistently, no matter how k is chosen (i.e., no matter how many sample autocorrelations are computed simultaneously, in so far as 1 < k < T-1). Hence Walker's proposal was interpreted as trying to improve the asymptotic variance of a consistent estimator.

On the other hand, from (8.7) we see that for k fixed, $-b_{1T}$ estimates consistently as $T \to \infty$, $-\beta^*_1 = \alpha (1-\alpha^{2k})(1-\alpha^{2k+2})^{-1}$. For large k this is very close to $\alpha$, but for the special case of k = 1 it equals $\alpha (1+\alpha^2)^{-1}$. This is correct because for k = 1 we are estimating the parameter of a first-order autoregression by ordinary least squares, and that gives a consistent estimator of $\rho_y(1)$, which equals $\alpha (1+\alpha^2)^{-1}$ for the first-order moving average model.

The situation persists for all other sample autocorrelations and autoregressive coefficients that enter in (6.1) and (6.2), because $\operatorname{plim}_{T\to\infty} r_{jT} = 0$ for $j > 1$, while $\operatorname{plim}_{T\to\infty} b_{jT} = (-\alpha)^j (1-\alpha^{2k+2-2j})(1-\alpha^{2k+2})^{-1}$, for j = 1, 2, ..., k. One implication is that Walker's procedure may depend less heavily upon the choice of k for a wide range of values of $\alpha$, and that it may also be less biased for small samples. The latter point showed up to a limited extent in the examples presented in the two original papers, but clearly more empirical evidence is needed, in particular about Walker's proposal that has not been considered to any extent in this connection.

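Note that $\sqrt{T}\,(r_{1T} - \rho_y(1))$ is asymptotically normally distributed with variance

(6.6)
$1 - 3\rho^2 + 4\rho^4 = 1 - 3 \left( \dfrac{\alpha}{1+\alpha^2} \right)^2 + 4 \left( \dfrac{\alpha}{1+\alpha^2} \right)^4$
$\quad = \dfrac{1 + \alpha^2 + 4\alpha^4 + \alpha^6 + \alpha^8}{(1+\alpha^2)^4}$
$\quad = \dfrac{(1-\alpha^2)^3}{(1+\alpha^2)^4} + \dfrac{4\alpha^2 + \alpha^4 (1+\alpha^2)^2}{(1+\alpha^2)^4}$

[from (2.5)], while from Theorem 2.3 it follows that the variance of the limiting normal distribution of $\sqrt{T}\,(\hat\rho - \rho_y(1))$ is the first term in the last line of (6.6).

For Durbin's proposal, $\sqrt{T}\,(\hat\beta_T - \beta^*)$ is asymptotically normal with covariance matrix $H = \Sigma^{-1} F\, \Sigma^{-1}$, which is approximated by $\sigma^2 \Sigma^{-1}$ for k large. Hence the variance of the limiting distribution of $\sqrt{T}\,(-b_{1T} - \alpha)$ is approximated for large k by

(6.7)  $\sigma^2 \sigma^{11} = \dfrac{\sigma^2 (1-\alpha^2)(1-\alpha^{2k})}{\sigma^2 (1-\alpha^2)(1-\alpha^{2k+2})} \ge 1 - \alpha^2$ ,

where $1-\alpha^2$ is approximately the variance of the limiting distribution of $\sqrt{T}\,(\hat\alpha_T - \alpha)$, for large k.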
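
The decomposition in (6.6) can be checked numerically. The sketch below rests on two assumptions of ours: that $\rho = \alpha/(1+\alpha^2)$, and that the "first term in the last line" is $(1-\alpha^2)^3/(1+\alpha^2)^4$, which is what the delta method gives when an estimator of $\alpha$ with asymptotic variance $1-\alpha^2$ is transformed to $\rho$. The function names are hypothetical.

```python
def var_r1(alpha: float) -> float:
    """Asymptotic variance of sqrt(T) * (r_1 - rho), written as 1 - 3 rho^2 + 4 rho^4."""
    rho = alpha / (1.0 + alpha * alpha)
    return 1.0 - 3.0 * rho ** 2 + 4.0 * rho ** 4


def var_rho_efficient(alpha: float) -> float:
    """Assumed first term of (6.6): (d rho / d alpha)^2 * (1 - alpha^2),
    with d rho / d alpha = (1 - alpha^2) / (1 + alpha^2)^2."""
    a2 = alpha * alpha
    return (1.0 - a2) ** 3 / (1.0 + a2) ** 4


def remainder(alpha: float) -> float:
    """Assumed second term of (6.6)."""
    a2 = alpha * alpha
    return (4.0 * a2 + a2 * a2 * (1.0 + a2) ** 2) / (1.0 + a2) ** 4


for alpha in (0.1, 0.3, 0.5, 0.7, 0.9):
    # Efficiency of r_1 relative to the improved estimator of rho.
    print(alpha, var_rho_efficient(alpha) / var_r1(alpha))
```

Under these assumptions the printed ratio is the asymptotic efficiency of the raw first sample autocorrelation, which deteriorates as $|\alpha|$ grows.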
For other comments about these points, in the case of Durbin's estimator, see McClave [(1973), Section 2].

d) The role of the truncation points. In Chapters 2 and 3 we dealt with k, the number of sample autocorrelations. In both cases q < k < T-1 for a moving average of order q.

In the original papers no precise directions were given about how to choose k in an empirical situation. The modification introduced in Chapter 3 allows for an easier choice of k, in the case of Walker's
proposal.  In  Mentz  (1972)  the  exact  forms  of  w^  and  w-^  entering  in 
(2.6) are given, so that one can easily write down closed-form expressions similar to (3.2)-(3.5) for the exact version dealt with in Chapter 2, and then prepare a table similar to Table 3.1.

In the moving average model the dimension of the minimal sufficient statistic is T, the sample size. By considering k sample quantities, where k is usually thought of as being much smaller than T [cf. (2.33)], one is omitting a relevant part of the sample information. This fact apparently had more important effects on small-sample biases than on asymptotic or small-sample variances. In fact the proposals, in particular that of Durbin that has been studied in greater detail, seem biased but quite efficient for most relevant sample sizes.

e) Corrections for bias, further remarks. In the case of Durbin's estimator attempts at correcting small-sample downward biases led to important increases in variances, both small-sample [McClave (1973)] and asymptotic [cf. (5.24) and (5.27)]. One way to interpret this fact is that, as in (d) above, omission of parts of the sufficient statistic led to losses of information. Some justifications about why the modifications would reduce the small-sample biases are given by McClave (1973).

f ) Relations  with  maximum  likelihood  and  least  squares  estimation. 
Durbin's  (1959)  way  to  go  from  the  b^T  to  0^,  is  to  set  up.  a 

likelihood  function  on  the  basis  of  the  limiting  normal  distribution  of 
the  b.  . Similarly  Walker  (1961)  starts  by  considering  the  limiting 
normal  distribution  of  the  r^.  4n  "kkis  sense  the  proposals  tend  to 
approximate,  for  large  T,  the  maximum  likelihood  method  of  estimation. 

However,  both  authors  introduce  simplifications  to  make  the  mathema- 
tical details  easier.  In  terms  of  our  discussion  in  Section  1.4.3  they 
both  come  closer  to  the  least  squares  procedure,  the  Jacobian  being 
neglected.  Further  the  inverse  of  the  covariance  matrix  is  also  appro- 
ximated. These  approximations  have  no  relevance  for  asymptotic  theory, 
as  we  showed  above,  but  may  be  important  in  small  samples,  and  may  con- 
tribute at  least  partially,  to  explain  differences  between  them  and  the 
maximum  likelihood  estimates. 

g)  Robustness  to  changes  in  the  distribution  of  the  error  terms. 

The  main  part  of  the  theory  in  Durbin's  and  Walker's  papers,  and 

in  our  work,  has  relied  upon  the  assumption  of  normality  of  the  error 
terms,  the  e_ ^ in  (l.l)  or  (1.7)* 

There  have  been  so  far  no  attempts  at  investigating  the  robustness 
of  estimation  procedures  for  the  moving  average  model  in  general.  We 
may  speculate  about  how  well  might  the  presently-considered  procedures 
behave  in  small-samples  when  the  probability  distribution  of  the 
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departs  signif ieatively  from  normality;  in  relation  to  other  existing 
proposals;  some  of  them  listed  in  Section  1.4. 

The  procedures  in  Chapters  1 through  5 start  by  considering 
sample  quantities  and  by  looking  at  their  asymptotic  distributions. 

These  turn  out  to  be  normal;  a result  that  holds  for  a wide  class  of 
distributions  of  the  [see-,  for  example;  Anderson  ( 1971a)  s Sections 

5*5  and  5*7  -3?  and  the  comments  by  Durbin  (1959.).?  Section  6].  Some,  other 
results  from  normal  distribution  theory  are  used  throughout. 

Hence one is inclined to believe that for moderate-sized samples the proposals might tend to show considerable robustness to departures from normality in the distribution of the $\varepsilon_t$. It would be relevant to have available some information about this point, possibly through Monte Carlo studies.
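A Monte Carlo study of the kind suggested here could be organized along the following lines. This is only a minimal sketch under assumptions that are not in the text: it looks at the first-order sample autocorrelation $r_1$ (one of the sample quantities the procedures start from) rather than at any of the estimators themselves, and the error distributions (normal, Laplace, uniform, all scaled to unit variance), the value $\alpha = 0.5$, the sample size $T = 50$ and the number of replications are illustrative choices.

```python
import numpy as np

def simulate_ma1(alpha, T, rng, dist="normal"):
    """Draw y_t = e_t + alpha * e_{t-1}, t = 1..T, with unit-variance errors."""
    if dist == "normal":
        e = rng.standard_normal(T + 1)
    elif dist == "laplace":
        # Laplace(0, b) has variance 2 b^2; b = 1/sqrt(2) gives variance 1.
        e = rng.laplace(0.0, 1.0 / np.sqrt(2.0), T + 1)
    elif dist == "uniform":
        # Uniform(-sqrt(3), sqrt(3)) has variance 1.
        e = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), T + 1)
    else:
        raise ValueError(dist)
    return e[1:] + alpha * e[:-1]

def r1(y):
    """First-order sample autocorrelation (the process mean is known to be 0)."""
    return np.dot(y[:-1], y[1:]) / np.dot(y, y)

def monte_carlo(alpha=0.5, T=50, reps=4000, seed=0):
    """Mean and standard deviation of r1 under each error distribution."""
    rng = np.random.default_rng(seed)
    out = {}
    for dist in ("normal", "laplace", "uniform"):
        vals = np.array([r1(simulate_ma1(alpha, T, rng, dist)) for _ in range(reps)])
        out[dist] = (vals.mean(), vals.std(ddof=1))
    return out

results = monte_carlo()
for dist, (m, s) in results.items():
    print(f"{dist:8s} mean r1 = {m:.3f}, sd = {s:.3f}")
```

Comparing the rows gives a first impression of how much the small-sample distribution of $r_1$ moves when normality is dropped; a study of the estimators themselves would replace `r1` by the corresponding estimation routine.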


6.2 Estimation in Moving Average Models of Higher Order.


Our derivations in the present work have been restricted to the first-order moving average. We want to comment here about the possible extension of the methods of proof to moving average models of higher order. These were considered in the original papers by Durbin and Walker.

The direct extension of the proof of Theorem 2.3 to the case of $q > 1$ seems quite feasible. The components of the matrix in (2.4) are known for all $q$ [see e.g. Anderson (1971a), Section 5.7.3]. $W_{22}(r)$ will be a Toeplitz matrix with equal elements along its central diagonals, and zeroes elsewhere; the components of the inverse of such matrices are given as functions of the roots of an associated polynomial equation in Mentz (1972). It will be necessary to prove some properties of these roots, corresponding to $|x_1| < 1$ in Section 2.2. (In fact $W_{22}(\rho)$ is positive definite, and can therefore be taken as the covariance matrix of a stationary moving average process; the argument in Anderson [(1971a), pp. 224-225] that we referred to in Section 1.3, together with the positive definiteness, will show that half of the roots are less and half larger than one in absolute value, as was the case in Section 2.2 when $q = 1$.) These properties would then be used to simplify the resulting expressions and to turn them into sums of random vectors whose order of dependence is a function of $k$, so that an extension of the procedure in Section 7.3.3 can be developed to give the asymptotic normality.
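The root property invoked in the parenthetical remark can be illustrated numerically. The sketch below is not part of the original argument: it takes a hypothetical second-order moving average with coefficients (1, 0.5, -0.3), forms the polynomial whose coefficients are the autocovariances $\gamma(-q), \ldots, \gamma(q)$, and counts the roots inside and outside the unit circle.

```python
import numpy as np

def ma_autocov(alpha, sigma2=1.0):
    """Autocovariances gamma(0..q) of y_t = sum_j alpha[j] * e_{t-j}."""
    alpha = np.asarray(alpha, dtype=float)
    q = len(alpha) - 1
    return np.array([sigma2 * np.dot(alpha[: q + 1 - h], alpha[h:])
                     for h in range(q + 1)])

def autocov_polynomial_roots(alpha):
    """Roots of sum_{h=-q}^{q} gamma(|h|) z^{q+h}."""
    gamma = ma_autocov(alpha)
    # Coefficients from z^{2q} down to z^0: gamma(q), ..., gamma(0), ..., gamma(q).
    coeffs = np.concatenate([gamma[::-1], gamma[1:]])
    return np.roots(coeffs)

alpha = [1.0, 0.5, -0.3]          # hypothetical MA(2) coefficients
roots = autocov_polynomial_roots(alpha)
inside = int(np.sum(np.abs(roots) < 1))
outside = int(np.sum(np.abs(roots) > 1))
print(sorted(np.abs(roots)), inside, outside)
```

Because the polynomial is symmetric in its coefficients, its roots come in reciprocal pairs, which is why the count splits evenly when no root lies on the unit circle.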

The evaluation of the limiting covariance matrix might involve heavy algebra, according to our experience in Section 7.3.4.

The proofs in Sections 4.2 and 4.3 relied upon the use of Lemma 4.4, which implies the knowledge of an exact closed-form expression for some components of $\Sigma^{-1}$, in terms of the $\alpha_j$ parameters. That could also be derived from Mentz (1972), since the roots of the polynomial equation associated with $\Sigma$ can be written as functions of the $\alpha_j$. However, the amount of algebraic detail in the proof of Theorem 4.6 makes us believe that the exact treatment of $k$ as fixed will be extremely laborious.

An approach such as that of Section 4.4 (applied afterwards in Chapter 5) may be more convenient. The approach will then provide the approximate behavior, for $k$ large, of the parameters of the limiting distributions as $T \to \infty$, and be based upon convenient approximations to the components of $\Sigma^{-1}$. Note however that Durbin [(1959), Section 5], using a different kind of argument, obtained the limiting covariance matrix, valid for large $k$.

Finally, and as was pointed out earlier, the attempts at treating $k$ as a function of $T$ for the proposal in Chapter 4, similar to what was done in Chapter 2, found severe mathematical difficulties, and no complete proofs are available so far, even for the first-order moving average.

It should be noted that the main difficulties arose in the analysis of the large-sample behavior of $M^{-1}$, where $M$ was defined in (4.5) and is of order $k \times k$, so that its size increases as $k$ increases with $T$. In Chapter 2 we faced a similar situation, but there the explicit components of $W_{22}^{-1}(r)$ could be obtained, because $W_{22}(r)$ has only a fixed number of nonzero central diagonals, the number being a function of $q$ and not of $k$ or $T$. Note that $M$ has all its components nonzero.

7. MATHEMATICAL DETAILS CORRESPONDING TO CHAPTER 2


7.1  Proof  of  Theorem  2.1  (Section  2.3). 

The components corresponding to the second braces of (2.29) will be evaluated first. As seen in Section 2.2, the $a^{ij}$ in these braces have a factor $x_1^{2k_T}$; if we treat each summand separately, we see that the larger

2 

contributions  come  from  terms  of  the  form  k « 
For large $T$ (and $k_T$) $A$ is approximately equal to $a^{12} a^{21}$. Since $|r_s| < 1$, for large enough $T$ the absolute value of (7.1) is bounded by a constant times

The condition $|x_1| < 1$ implies that the two series in (7.2) converge, and hence (7.1) is negligible as $T \to \infty$. The argument can be used to show that each component in the second braces of (2.29) converges in probability to zero.

The argument cannot be used with the first braces in (2.29) though, because there the components do not have $x_1^{k_T}$ as a common factor. We have to show that

(7.3)  $\operatorname{plim}_{T\to\infty} \sum_{i=1}^{k_T-1} r_{i+1,T}\, x_{1T}^{i} = 0$.

Hence we have to show that given $\varepsilon$ and $\delta$ positive, there exists an integer $T_0 = T_0(\varepsilon,\delta)$ such that $T > T_0$ implies that

(7.4)  $P\left\{\left|\sum_{i=1}^{k_T-1} r_{i+1,T}\, x_{1T}^{i}\right| > \varepsilon\right\} < \delta$;

here we use the notation $x_{1T}$ to emphasize its dependence on $T$ (through $r_{1T}$).

Let $n$ be a fixed positive integer, a function of $\varepsilon$ and $\alpha$ only, that will be made explicit below. We have that

To arrive at the second inequality we used that $|x_{1T}| < 1$ and that $|r_{jT}| < 1$.

Since plim $r_{jT} = 0$ for $j = 2, 3, \ldots, n$, there exist integers $T_j = T_j(\varepsilon,\delta)$ such that $T > T_j$ implies that

In the second term of (7.5) we have that

There exists an integer $T^* = T^*(\varepsilon,\delta)$ such that if $T > T^*$ then


(7.8) 
because plim $x_{1T} = -\alpha$. Hence the first term in (7.7) will be less than $\delta/3$ if $T > T^*$, provided only that

£ 

(7-9)  (i-|a|  ) — > (e  + ]a!  )n  . 

1 + - 
2 

This defines $n$ as a function of $\varepsilon$ and $\alpha$, independently of $T$ or $k_T$.

Similarly the second term in (7.7) will be less than $\delta/3$ provided $T > T^{**}(\varepsilon,\delta)$, say. Let $T_0 = \max(T_2, \ldots, T_n, T^*, T^{**})$; then (7.4) holds for all $T > T_0$, as desired.

A similar argument will show that terms like $\sum_{i=1}^{k_T-1} i\, r_{i+1,T}\, x_{1T}^{i}$ converge stochastically to zero. This completes the proof of the theorem. Q.E.D.

7.2  Proofs  of  Lemmas  2.1  and  2.2  (Section  2.4). 

Proof of Lemma 2.1. Suppose that (2.31) holds. Then $\lim_{T\to\infty} k_T/\log T = +\infty$, and

$\lim_{T\to\infty} \frac{(-m \log a)\,k_T}{n \log T} - 1 = \lim_{T\to\infty} \frac{(-m \log a)\,k_T - n \log T}{n \log T} = +\infty$.

This in turn implies that $n \log T + k_T\, m \log a = \log(T^n a^{m k_T})$ converges to $-\infty$, which is equivalent to (2.30).

Suppose now that (2.30) holds but that (2.31) does not. Then there exists a subsequence $\{T_u : u = 1, 2, \ldots\}$ such that for every $d > 0$, if $T_u$ is large enough

(7.10)  $\log T_u / k_{T_u} > d$;

multiplying (7.10) by $n$ we deduce that for every $d > 0$

(7.11)  $n \log T_u - n\, d\, k_{T_u} > 0$.

If in particular we let $d = (-m \log a)/n > 0$ in (7.11) we contradict (2.30). This completes the proof. Q.E.D.
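A quick numerical illustration of the first half of the lemma may help. The sketch below assumes, as the proof suggests, that (2.30) states $T^n a^{m k_T} \to 0$ and (2.31) states $k_T/\log T \to \infty$; the constants `n`, `m`, `a` and the two choices of $k_T$ are hypothetical values picked only for illustration.

```python
import math

def log_term(T, k, n=0.5, m=1.0, a=0.6):
    """log(T^n * a^(m k)) = n log T + m k log a."""
    return n * math.log(T) + m * k * math.log(a)

Ts = [10 ** p for p in range(2, 9)]

# k_T = (log T)^2, so that k_T / log T -> infinity.
fast = [log_term(T, math.log(T) ** 2) for T in Ts]

# k_T = 0.5 log T, so that k_T / log T stays bounded (and small).
slow = [log_term(T, 0.5 * math.log(T)) for T in Ts]

for T, f, s in zip(Ts, fast, slow):
    print(f"T = {T:>9d}  fast: {f:9.2f}  slow: {s:6.2f}")
```

In the first column the logarithm decreases without bound, so $T^n a^{m k_T}$ tends to zero; in the second it increases, so for this slowly growing $k_T$ the product diverges.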

Proof of Lemma 2.2. Let $\eta$ and $\varepsilon$ be positive and fixed. For $M > 0$ we have that

$P\{|Z_T||Y_T| > \eta\} = P\{|Z_T||Y_T| > \eta,\ |Z_T| \le M\} + P\{|Z_T||Y_T| > \eta,\ |Z_T| > M\}$
$\le P\{|Y_T| > \eta/M,\ |Z_T| \le M\} + P\{|Z_T| > M\}$
$\le P\{|Y_T| > \eta/M\} + P\{|Z_T| > M\}$.

But $P\{|Z_T| > M\} \le P\{|Z| > M\} + \varepsilon$ if $T$ is large enough, since by hypothesis $Z_T$ converges in distribution to $Z$; if $M$ is chosen appropriately, then $P\{|Z| > M\} < \varepsilon$ too, by hypothesis. For that choice of $M$, $P\{|Y_T| > \eta/M\} < \varepsilon$ if $T$ is large enough, since $Y_T$ converges in probability to 0. This completes the proof. Q.E.D.

7.3 Proof of Theorem 2.3 (Section 2.4).

7.3.1 Part 2 (Simplifying the $m_{1T}(j)$'s).

-L 

We  substitute  (2.37)  into  (2.34)  and  find  that  we  have  to  deal  with 

a >JST 

PT  - Py&)  = I + X1T  m2,T(>)-l)lTJT  - py(l) 

t] 

(7.12) 

kk 

+ x.,  m p (l)  . 

IT  y v ' 

The two quantities in brackets in the last line are of the same nature, and it will be shown below that the first one, normalized by $\sqrt{T}$, has a limiting normal distribution. Since the second bracket has a factor of $x_{1T}^{k_T}$ and $\operatorname{plim}_{T\to\infty} x_{1T}^{k_T} = 0$, we see that the claim will be proved if $\operatorname{plim}_{T\to\infty} \sqrt{T}\,|x_{1T}|^{k_T} \rho_y(1) = 0$.

Let $\varepsilon > 0$ be given. For any fixed $\eta$ satisfying, say, $0 < \eta < (1/2)(1 - |\alpha|)$, we have that $|\alpha| + \eta < 1$, and by Lemma 2.1

(7.13)  $\lim_{T\to\infty} \sqrt{T}\,(|\alpha| + \eta)^{k_T} = 0$.

Hence there exists an integer $T_1 = T_1(\varepsilon)$ such that if $T > T_1$, then $\sqrt{T}\,(|\alpha| + \eta)^{k_T} < \varepsilon$. Hence if $T > T_1$,


kp 

kk 

f rjT  - py(l) 

, . T 

X1T 

k(y, 

E 

i=i 
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p 


,XkT 

X1T 


< P 


Xkm  \k 

> /t  C|a|  + n) 


T 


! 


(7.14) 


= P 


> |a 


< P 


This last expression can in turn be made arbitrarily small, because plim $x_{1T} = -\alpha$ as $T \to \infty$.

Hence we concentrate on $m_{1T}(j)$. From (2.29) and the argument following that expression, $m_{1T}(j)$ is the part of


(7.15)  - x^T  [2r(l-r2) (a11  + j a21)  + r2(a12  + j a22)] 

not having $x_1^{k_T}$ as a factor. To find the desired limiting distribution this can be taken as


(7.16) 

7.3.2 Part 3 (Substituting parameters for random variables in the $m_{1T}(j)$'s).

Since $|x_1| = |\alpha| < 1$, there exists $\eta > 0$ such that $|x_1 + \eta| < 1$. Then


kT-i 


/T  l H - 


■3=1 


X1T“  Xl;Cd+l,T 


<f  l 

j=l| 


X1T  V3  I X1  ^ 


\ Xl+Tl/  \X1  +T1/ 


\ cl 

(xlV)  Cj+1,T 


(7.17) 

As in the proof of (7.4) let us introduce a fixed integer $n$, to be specified below, so that (7.17) becomes bounded by


n-1 

/T  I 
j=l 


'IT  1 

d j 

f xi ' 

Id 

iv'ij 

lxi+7>i 

(xi.+4)D  Cj+1^T 


/T  l 
3=n 


X1T 

3 _ / *!  1 

d 

Xl+4j 

\V'il 

(x1+n)t]  Cj+1^T 


(7.18) 


x±  \0 


\X1+Ti  l 


I h1+t23  [/?  =Jtl,T 

0 i 


f 3= n 


— 

f X1T 

1 ~ 

j 

X! 

7™"  " 

Xl+Tl/ 

ixl+T1i 

h2 


',2d  m 2 


I T c 

d=n 


i " ~j+i,t  ’ 


where  we  have  used  the  Cauchy-Schwarz  inequality. 

In the first factor of the first term of (7.18), for any fixed $n$, from the fact that plim $x_{1T} = x_1$, we conclude that the whole factor converges in probability to zero. In the second factor we note that $\sqrt{T}\,(c_{2T}, \ldots, c_{nT})$ is asymptotically normally distributed with zero expectations and finite variances and covariances [cf. Anderson (1971a), Corollary 8.4.1]. Hence the distribution of the sum behaves asymptotically like that of a linear combination of the squares of $n-1$ normal random variables, with weights given by the $(x_1+\eta)^{2j}$. It follows that its square root satisfies the hypotheses of Lemma 2.2, and hence that the first term converges in probability to zero as $T \to \infty$.

To deal with the second term in (7.18) we require that $|x_{1T}/(x_1+\eta)| < 1$ with high probability. But for $\eta > 0$,


x 


IT1 


< 


(7.19) 


> P 


x. 


1T! 


< x., 


say, 


> P<  x. 


IT 


$> 1 - \delta$,

and is arbitrarily close to 1 if $T$ is sufficiently large. For all choices of $T$ satisfying (7.19) we have that

and the second probability will be less than some arbitrarily small $\delta > 0$. In the first probability, since both arguments are less than one in absolute value, the infinite series can be evaluated explicitly, its value being

Since plim $x_{1T} = x_1$, this converges in probability to zero as $T \to \infty$, for any fixed $n$. Hence the right-hand side of (7.20) can be made arbitrarily small for $T$ large enough. This shows that the first factor of the second term of (7.18) is asymptotically negligible.

In the second factor we apply Chebyshev's inequality. For any $\varepsilon > 0$,
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I, 


l (S1+1)2J  I c2  > e 
=n  u ’ 


Y ^1+Tl)  T $ c j +1 , T 

> <^— p~ 


(7=21) 


-7  JL 


i f (x1+r|)2(‘J‘1)  | £ #’(et+“et-l^et+j +ast+j-l- 

s,t 


2 i-  V^1 
e j=n+l 


The expectations vanish unless $t = s$, $t = s-1$ or $t-1 = s$, because the $\varepsilon_t$'s are independent and have zero expectations. There are less than $3T$ such nonvanishing expectations, each one of which is bounded by the same constant, because the $\varepsilon_t$'s are normally distributed. Hence the absolute value of (7.21) is bounded by a constant times

(7.22)  $\sum_{j=n+1}^{\infty} |x_1+\eta|^{2(j-1)}$.

This last expression defines the choice of $n$, as a function of $\alpha$, $\varepsilon$, etc., but independently of $T$ and $k_T$, so that the right-hand side of (7.21) is made arbitrarily small.

This completes the proof that (7.18) converges in probability to zero.

7.3.3 Part 4 (The asymptotic normality).

As in (2.43) let

(7.23)  $Z_T = \frac{1}{\sqrt{T}} \sum_{t=1}^{T} W_{tT}$,

where the $W_{tT}$'s are defined in (2.44). To develop the asymptotic theory and in order to simplify the calculations, one can take as definition of the $W_{tT}$'s for all $t$, $t = 1, 2, \ldots, T$, the first line of (2.44). There would be $k_T^2/2$ extra terms added in the sum over $t$, but this is asymptotically negligible compared with the existing $T k_T$ terms, since $k_T^2/T \to 0$ as $T \to \infty$. Hence we take


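(7.24)  $W_{tT} = \frac{1}{\sigma^2(1+\alpha^2)} \sum_{j=0}^{k_T} m(j-1)\,(y_t y_{t+j} - \mathcal{E}\, y_t y_{t+j})$,  $t = 1, 2, \ldots, T$,

and

(7.25)  $m(-1) = -\frac{\alpha}{1+\alpha^2}$,
        $m(j) = x_1^{j}\left[1 + j\sqrt{1-4\rho^2}\right] = (-\alpha)^j\left[1 + j\,\frac{1-\alpha^2}{1+\alpha^2}\right]$,  $j = 0, 1, \ldots, k_T-1$.

(7.25) can be written more compactly as $m(j) = \delta_j\,(-\alpha)^j\,[1 + j(1-\alpha^2)/(1+\alpha^2)]$, where $\delta_j$ equals $1/2$ when $j = -1$ and equals 1 when $j = 0, 1, \ldots, k_T-1$.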

Taken as a stochastic process, $\{W_{tT}\}$ is weakly stationary, has zero expectations, is finitely dependent of order $k_T+1$, and finitely correlated of order 1. The dependence follows because $W_{tT}$ depends on $y_t, \ldots, y_{t+k_T}$, and hence on $\varepsilon_{t-1}, \ldots, \varepsilon_{t+k_T}$, while $W_{t+s,T}$ depends on $y_{t+s}, \ldots, y_{t+s+k_T}$, and hence on $\varepsilon_{t+s-1}, \ldots, \varepsilon_{t+s+k_T}$. The correlation argument follows because

(7.26)
$\mathcal{E}\, W_{tT} W_{t+s,T} = \frac{1}{[\sigma_y(0)]^2} \sum_{j=0}^{k_T} \sum_{j'=0}^{k_T} m(j-1)\, m(j'-1)\, \left[\mathcal{E}(y_t y_{t+j} y_{t+s} y_{t+s+j'}) - \mathcal{E}(y_t y_{t+j})\, \mathcal{E}(y_{t+s} y_{t+s+j'})\right]$
$= \frac{\sigma^4}{[\sigma_y(0)]^2} \sum_{j=0}^{k_T} \sum_{j'=0}^{k_T} m(j-1)\, m(j'-1)\, d_{jj'}(s)$
$= \frac{1}{(1+\alpha^2)^2} \sum_{j=0}^{k_T} \sum_{j'=0}^{k_T} m(j-1)\, m(j'-1)\, d_{jj'}(s)$.

Here $\mathcal{E}(y_t y_{t+i} y_{t+s} y_{t+s+j}) - \mathcal{E}(y_t y_{t+i})\, \mathcal{E}(y_{t+s} y_{t+s+j}) = \sigma^4 d_{ij}(s)$, and the $d_{ij}(s)$ are given by


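(7.27)  $d_{ij}(s) =$

  $2(1+\alpha^2)^2$,         $s = 0$,       $i = j = 0$,
  $1 + 3\alpha^2 + \alpha^4$,  $s = 0$,       $i = j = 1$,
  $(1+\alpha^2)^2$,          $s = 0$,       $i = j > 1$,
  $2\alpha(1+\alpha^2)$,     $s = 0$,       $(i,j) = (0,1)$ or $(1,0)$,
  $\alpha(1+\alpha^2)$,      $s = 0$,       $|i-j| = 1$, $(i,j) \ne (0,1)$ or $(1,0)$,
  $2\alpha^2$,               $s = 1\ (-1)$, $i = j = 0$,
  $\alpha^2$,                $s = 1\ (-1)$, $i = j > 0$,
  $2\alpha(1+\alpha^2)$,     $s = 1\ (-1)$, $i = 0$, $j = 1$ ($i = 1$, $j = 0$),
  $\alpha(1+\alpha^2)$,      $s = 1\ (-1)$, $i = j-1$, $i \ge 1$ ($i = j+1$, $j \ge 1$),
  $2\alpha^2$,               $s = 1\ (-1)$, $i = 0$, $j = 2$ ($i = 2$, $j = 0$),
  $\alpha^2$,                $s = 1\ (-1)$, $i = j-2$, $i > 0$ ($i = j+2$, $j > 0$),
  $0$,                       all other possibilities.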

To prove (7.27) we write $y_t = \varepsilon_t + \alpha\varepsilon_{t-1}$ for each index $t$, enumerate all possible cases, and use the fact that the $\varepsilon_t$'s are independent, normal and have zero expected values. Alternatively one could use formula (8.18) in Section 8.4.2 directly.
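The entries of (7.27) can be cross-checked with the standard fourth-moment formula for jointly normal variables with zero means, which gives $\mathrm{Cov}(y_t y_{t+i},\, y_{t+s} y_{t+s+j}) = \gamma(s)\gamma(s+j-i) + \gamma(s+j)\gamma(s-i)$, where $\gamma$ is the autocovariance function. The sketch below is not the author's enumeration; it works in units of $\sigma^4$, uses the hypothetical value $\alpha = 0.5$, and the helper names are illustrative. Which member of a pair $(i,j)$, $(j,i)$ carries a nonzero off-diagonal value for $s = 1$ depends on the index convention, so for those entries only the sum over the pair is compared.

```python
def gamma(h, alpha):
    """Autocovariance of y_t = e_t + alpha * e_{t-1}, in units of sigma^2."""
    h = abs(h)
    if h == 0:
        return 1.0 + alpha ** 2
    if h == 1:
        return alpha
    return 0.0

def d(i, j, s, alpha):
    """Cov(y_t y_{t+i}, y_{t+s} y_{t+s+j}) / sigma^4 for Gaussian y."""
    return (gamma(s, alpha) * gamma(s + j - i, alpha)
            + gamma(s + j, alpha) * gamma(s - i, alpha))

a = 0.5  # hypothetical parameter value
checks = {
    "s=0, i=j=0": (d(0, 0, 0, a), 2 * (1 + a ** 2) ** 2),
    "s=0, i=j=1": (d(1, 1, 0, a), 1 + 3 * a ** 2 + a ** 4),
    "s=0, i=j>1": (d(3, 3, 0, a), (1 + a ** 2) ** 2),
    "s=0, (0,1)": (d(0, 1, 0, a), 2 * a * (1 + a ** 2)),
    "s=0, (2,3)": (d(2, 3, 0, a), a * (1 + a ** 2)),
    "s=1, i=j=0": (d(0, 0, 1, a), 2 * a ** 2),
    "s=1, i=j>0": (d(2, 2, 1, a), a ** 2),
    "s=1, pair (0,1)": (d(0, 1, 1, a) + d(1, 0, 1, a), 2 * a * (1 + a ** 2)),
    "s=1, pair (0,2)": (d(0, 2, 1, a) + d(2, 0, 1, a), 2 * a ** 2),
}
for name, (computed, tabulated) in checks.items():
    print(f"{name:16s} computed = {computed:.4f}  tabulated = {tabulated:.4f}")
```

The same function returns zero for every $(i,j)$ with nonnegative indices once $|s| \ge 2$, which is the statement that $\{W_{tT}\}$ is finitely correlated of order 1.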

We proceed now as in Anderson [(1971a), pp. 538-539]. Let $\{N_T\}$ be a sequence of integers (functions of $T$) such that $k_T/N_T \to 0$ and $N_T/T \to 0$ as $T \to \infty$. Let $M_T$ be the integer part of $T/N_T$. Then $Z_T$ is asymptotically equivalent to

(7.28)  $\frac{1}{\sqrt{M_T}} \sum_{j=1}^{M_T} (Z_{jT} + Y_{jT}) + \frac{R_T}{\sqrt{T}}$.

Even for finite $T$, the approximation problem is minor because $N_T/T$ may differ only slightly from $1/M_T$. In (7.28) we defined

(7.29)
$Z_{jT} = \frac{1}{\sqrt{N_T}} \sum_{i=1}^{N_T-k_T} W_{(j-1)N_T+i,\,T}$,  $j = 1, 2, \ldots, M_T$,
$Y_{jT} = \frac{1}{\sqrt{N_T}} \sum_{i=N_T-k_T+1}^{N_T} W_{(j-1)N_T+i,\,T}$,  $j = 1, 2, \ldots, M_T$,
$R_T = W_{N_T M_T+1,\,T} + \cdots + W_{TT}$;

the last definition is void if $N_T M_T = T$, in which case we set $R_T = 0$.

We first show that the terms involving the random variables $Y_{jT}$ and $R_T$ converge in probability to 0 as $T \to \infty$. To do so it suffices to prove that the corresponding second-order moments converge to 0, because the expected values are zero for each $T$. This corresponds to proving mean-square convergence to 0. Now

(7.30)  $\mathcal{E}\left(\frac{1}{\sqrt{M_T}} \sum_{j=1}^{M_T} Y_{jT}\right)^2 = \frac{1}{M_T N_T} \sum_{j,j'=1}^{M_T}\ \sum_{s,s'=N_T-k_T+1}^{N_T} \mathcal{E}\, W_{(j-1)N_T+s,\,T}\, W_{(j'-1)N_T+s',\,T}$.

If $j \ne j'$ the expectations vanish, because then the corresponding $W$'s are independent, their subindices differing by at least $N_T - k_T$. For $j = j'$ the expectations vanish unless $|s - s'| \le 1$, because of (7.27). Then by stationarity of the $\{W_{tT}\}$ process, (7.30) equals

(7.31)  $\frac{1}{M_T N_T} \sum_{j=1}^{M_T} \left(\sum_{s=1}^{k_T} \mathcal{E}\, W_{1T}^2 + 2 \sum_{s=1}^{k_T-1} \mathcal{E}\, W_{1T} W_{2T}\right) = \frac{k_T}{N_T}\, \mathcal{E}\, W_{1T}^2 + 2\, \frac{k_T-1}{N_T}\, \mathcal{E}\, W_{1T} W_{2T}$,

which converges to zero as $T \to \infty$ since, by hypothesis, $k_T/N_T \to 0$. That the second moments in (7.31) remain finite as $T \to \infty$ follows from (7.26) and (7.27), once we note that the $m(j)$'s are exponential functions of $\alpha$, and $|\alpha| < 1$.

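The same kind of argument can be used with

(7.32)  $\mathcal{E}\left(\frac{R_T}{\sqrt{T}}\right)^2 = \frac{1}{T}\, \mathcal{E}\left(W_{N_T M_T+1,\,T} + \cdots + W_{TT}\right)^2 \le \frac{N_T}{T}\, \mathcal{E}\, W_{1T}^2 + 2\, \frac{N_T}{T}\, |\mathcal{E}\, W_{1T} W_{2T}|$,

and this tends to zero since, by hypothesis, $N_T/T$ tends to zero as $T \to \infty$.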
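
It follows that it suffices to find the limiting distribution of

(7.33)  $\frac{1}{\sqrt{M_T}} \sum_{j=1}^{M_T} Z_{jT}$,

where by construction the $Z_{jT}$'s are independent, identically distributed, and for all $j$ and $T$, $\mathcal{E}\, Z_{jT} = 0$,

(7.34)  $\mathcal{E}\, Z_{jT}^2 = \frac{1}{N_T}\left[(N_T - k_T)\, \mathcal{E}\, W_{1T}^2 + 2 (N_T - k_T - 1)\, \mathcal{E}\, W_{1T} W_{2T}\right] = \left(1 - \frac{k_T}{N_T}\right) \mathcal{E}\, W_{1T}^2 + 2 \left(1 - \frac{k_T+1}{N_T}\right) \mathcal{E}\, W_{1T} W_{2T}$.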

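If we now write (7.33) as

(7.35)  $\frac{1}{\sqrt{M_T}} \sum_{j=1}^{M_T} Z_{jT} = (\mathcal{E}\, Z_{1T}^2)^{1/2} \sum_{j=1}^{M_T} Z^*_{jT}$,  $Z^*_{jT} = Z_{jT} / (M_T\, \mathcal{E}\, Z_{1T}^2)^{1/2}$,

we have that $\mathcal{E}\, Z^*_{jT} = 0$ and $\sum_{j=1}^{M_T} \mathcal{E}\, Z^{*2}_{jT} = 1$. We want to use Liapounov's Central Limit Theorem [see Loève (1963), Chapter VI]; for that it suffices to prove that for some $\delta > 0$,

(7.36)  $\lim_{T\to\infty} \sum_{j=1}^{M_T} \mathcal{E}\, |Z^*_{jT}|^{2+\delta} = 0$.

We choose $\delta = 2$. Then

(7.37)  $\sum_{j=1}^{M_T} \mathcal{E}\, Z^{*4}_{jT} = \frac{1}{M_T}\, \frac{\mathcal{E}\, Z_{1T}^4}{(\mathcal{E}\, Z_{1T}^2)^2}$,

where

(7.38)  $\mathcal{E}\, Z_{1T}^4 = \frac{1}{N_T^2} \sum_{t,s,q,v=1}^{N_T-k_T} \mathcal{E}\, W_{tT} W_{sT} W_{qT} W_{vT}$,

and it suffices to show that (7.37) converges to zero as $T \to \infty$, or (more strongly) that (7.38) is bounded uniformly in $T$.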

Note that a fourth-order moment of $W$ includes the expectation of a product of eight of the $\varepsilon$'s (in particular that of $\varepsilon_t^8$ when $s = t = q = v$, and $j = 0$ in the definition (7.24) of each $W$); since the $\varepsilon$'s are normal, these eighth-order moments are finite. If instead we did not assume normality of the $\varepsilon$'s, some assumption about their eighth-order moments would be called for. In any case, any fourth-order moment of the $W$'s is bounded, uniformly in $T$.

To analyze (7.38) we consider separately the following five cases:

1) $t = s = q = v$. There are $N_T - k_T$ terms $\mathcal{E}\, W_{tT}^4$, so that their contribution is negligible as $T \to \infty$.

2) $t = s = q \ne v$. There are $4(N_T-k_T)(N_T-k_T-1)$ terms of the form $\mathcal{E}\, W_{tT}^3 W_{vT}$, so that their contribution to (7.38) remains bounded as $T \to \infty$. Note that $4(N_T-k_T)(N_T-k_T-1)/N_T^2$ converges to 4 as $T \to \infty$.

3) $t = s \ne q = v$. There are $3(N_T-k_T)(N_T-k_T-1)$ terms $\mathcal{E}\, W_{tT}^2 W_{qT}^2$, so that their contribution is also negligible.

4) $v = t$, $t \ne s \ne q$. There are $6(N_T-k_T)(N_T-k_T-1)(N_T-k_T-2)$ such terms. Let us consider the subcase $t < s < q$, since the other ones are treated similarly. If $|t-s| > k_T+1$, $W_{tT}^2$ and $W_{sT}$ are independent and the expectation vanishes unless $|s-q| \le 1$; there are at most $2(N_T-k_T)(k_T+1)$ such terms. If $|t-s| \le k_T+1$, then $W_{tT}^2$ and $W_{sT}$ are not independent and the expectation may not vanish if $|s-q| \le k_T+1$; there are at most $(N_T-k_T)[2(k_T+1)]^2 = 4(N_T-k_T)(k_T+1)^2$ such terms.

5) All subindices differ. There are $(N_T-k_T)(N_T-k_T-1)(N_T-k_T-2)(N_T-k_T-3)$ such terms. Consider the subcase $v < t < s < q$, since the other ones are treated similarly. By definition (7.24), and recalling that $y_t = \varepsilon_t + \alpha\varepsilon_{t-1}$, we see that (7.38) is composed of terms equal to a constant times

(7.39)  $\frac{1}{N_T^2} \sum_{t,s,q,v=1}^{N_T-k_T}\ \sum_{j,j',j'',j'''=0}^{k_T} m(j-1)\, m(j'-1)\, m(j''-1)\, m(j'''-1)\, \left[\mathcal{E}(\varepsilon_v \varepsilon_{v+j}\, \varepsilon_t \varepsilon_{t+j'}\, \varepsilon_s \varepsilon_{s+j''}\, \varepsilon_q \varepsilon_{q+j'''}) - \mathcal{E}(\varepsilon_v \varepsilon_{v+j})\, \mathcal{E}(\varepsilon_t \varepsilon_{t+j'})\, \mathcal{E}(\varepsilon_s \varepsilon_{s+j''})\, \mathcal{E}(\varepsilon_q \varepsilon_{q+j'''})\right]$

plus other similar terms with some of the subindices, or all of them, reduced by 1.

In (7.39), if $j \ne 0$ then $\varepsilon_v$ and $\varepsilon_{v+j}$ are independent, and since $\mathcal{E}\,\varepsilon_v = 0$ the contribution vanishes. If $j = 0$ but $j' \ne 0$, again we have a zero expectation. By a similar argument we can see that only the case $j = j' = j'' = j''' = 0$ remains to be studied; but then we have that

(7.40)  $\mathcal{E}(\varepsilon_v^2 \varepsilon_t^2 \varepsilon_s^2 \varepsilon_q^2) - \mathcal{E}\,\varepsilon_v^2\ \mathcal{E}\,\varepsilon_t^2\ \mathcal{E}\,\varepsilon_s^2\ \mathcal{E}\,\varepsilon_q^2 = 0$.

For the other terms with subindices reduced by one, a similar argument applies if $v$, $t$, $s$, and $q$ differ by at least (say) 3 units.

Hence it suffices to show that in terms like (7.39), when $v = t$, $|t-s| \le k_T+1$, $|s-q| \le k_T+1$, $t < s < q$, the corresponding contribution to (7.37) tends to zero as $T \to \infty$. In the analysis of case 4) above we argued that there are at most $4(N_T-k_T)(k_T+1)^2$ such terms. Now, by the Cauchy-Schwarz inequality, the expectation part is bounded, for all choices of subindices, by

(7.41)  $\mathcal{E}\,\varepsilon^8 + (\mathcal{E}\,\varepsilon^2)^4 = 105\,\sigma^8 + \sigma^8 = 106\,\sigma^8$,

so that the contribution is bounded by

(7.42)  $106\,\sigma^8\ \frac{1}{M_T}\ \frac{4 (N_T-k_T)(k_T+1)^2}{N_T^2\, (\mathcal{E}\, Z_{1T}^2)^2} \left[\sum_{j=0}^{k_T} |m(j-1)|\right]^4$,

which is asymptotically equivalent to

(7.43)  $\frac{424\,\sigma^8\, (N_T-k_T)(k_T+1)^2}{(\mathcal{E}\, Z_{1T}^2)^2\ T\, N_T} \left[\sum_{j=0}^{k_T} \delta_{j-1}\, |\alpha|^{j-1} \left(1 + (j-1)\, \frac{1-\alpha^2}{1+\alpha^2}\right)\right]^4$;

in turn this is equivalent to a constant times

(7.44)  $\frac{(k_T+1)^2}{T} \left[\sum_{j=0}^{\infty} \delta_{j-1}\, |\alpha|^{j-1} \left(1 + (j-1)\, \frac{1-\alpha^2}{1+\alpha^2}\right)\right]^4$.

Recall that $\delta_j$ can equal only 1 or $1/2$. Since $k_T^2/T \to 0$ as $T \to \infty$ and the sum over $j$ is finite because $|\alpha| < 1$, (7.44) tends to zero as $T \to \infty$, which is what we wanted to prove.

From (7.34) we see that

(7.45)  $\lim_{T\to\infty} \mathcal{E}\, Z_{1T}^2 = \lim_{T\to\infty} \left(\mathcal{E}\, W_{1T}^2 + 2\, \mathcal{E}\, W_{1T} W_{2T}\right)$.

By Liapounov's Central Limit Theorem we conclude that (2.43) or (7.23) is asymptotically normally distributed with mean 0 and variance given in (2.45).
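The limiting behavior of $Z_T$ can be examined by simulation. The sketch below is an illustration under assumptions, not the author's computation: it takes $\sigma^2 = 1$, reads (7.24) and (7.25) as $W_{tT} = [\sigma_y(0)]^{-1} \sum_{j=0}^{k} m(j-1)(y_t y_{t+j} - \mathcal{E}\, y_t y_{t+j})$ with $m(-1) = -\alpha/(1+\alpha^2)$ and $m(j) = (-\alpha)^j [1 + j(1-\alpha^2)/(1+\alpha^2)]$, and compares the sample variance of $Z_T$ with $(1-\alpha^2)^3/(1+\alpha^2)^4$, the value to which the series of Section 7.3.4 sum under this reading. The parameter values and function names are hypothetical.

```python
import numpy as np

def weights(alpha, k):
    """w[j] = m(j-1), j = 0..k, from the compact form of (7.25)."""
    c = (1 - alpha ** 2) / (1 + alpha ** 2)
    w = np.empty(k + 1)
    w[0] = -alpha / (1 + alpha ** 2)             # m(-1)
    idx = np.arange(k)                            # values of j - 1 = 0..k-1
    w[1:] = (-alpha) ** idx * (1 + idx * c)       # m(0), ..., m(k-1)
    return w

def z_statistic(alpha, T, k, rng):
    """One draw of Z_T = T^{-1/2} * sum_t W_tT for a Gaussian MA(1), sigma^2 = 1."""
    e = rng.standard_normal(T + k + 1)
    y = e[1:] + alpha * e[:-1]                    # length T + k
    gam = np.zeros(k + 1)
    gam[0], gam[1] = 1 + alpha ** 2, alpha        # E y_t y_{t+j}; zero for j >= 2
    w = weights(alpha, k)
    W = np.zeros(T)
    for j in range(k + 1):
        W += w[j] * (y[:T] * y[j:j + T] - gam[j])
    W /= gam[0]
    return W.sum() / np.sqrt(T)

rng = np.random.default_rng(0)
alpha, T, k, reps = 0.5, 1000, 12, 2000
z = np.array([z_statistic(alpha, T, k, rng) for _ in range(reps)])
target = (1 - alpha ** 2) ** 3 / (1 + alpha ** 2) ** 4
print(f"mean = {z.mean():.4f}, variance = {z.var(ddof=1):.4f}, target = {target:.4f}")
```

A normal probability plot of `z` would be the natural next step for judging how quickly the normal approximation sets in for a given pair $(T, k)$.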


Note. From the proof above it follows that random variables like (7.23), which are (normalized) linear combinations of random variables finitely dependent of an order ($k_T+1$ in our case) that increases with $T$, are asymptotically normal provided the rate of increase of the order of dependence is adequately smaller than $T$ ($k_T^2/T \to 0$ in our case), and that the weights (the $m(j)$ in our case) are summable. Recently Berk (1973) proved a theorem that deals with a similar situation. This same author [Berk (1974)] used an argument parallel to that used above to prove the asymptotic normality of the autoregressive spectral estimator; in his case it turned out that he needed $k_T^3/T \to 0$ (in our notation).


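7.3.4 Part 5 (The asymptotic variance).

We first note that

(7.46)  $\sum_{j=0}^{\infty} \alpha^{2j} = \frac{1}{1-\alpha^2}$,  $\sum_{j=1}^{\infty} j\,\alpha^{2j} = \frac{\alpha^2}{(1-\alpha^2)^2}$,  $\sum_{j=1}^{\infty} j^2 \alpha^{2j} = \frac{\alpha^2 (1+\alpha^2)}{(1-\alpha^2)^3}$.

Next

(7.47)
$R_1 = (1+\alpha^2)^2\, \mathcal{E}\, W_{1T}^2 = \sum_{j=0}^{k_T} \sum_{j'=0}^{k_T} m(j-1)\, m(j'-1)\, d_{jj'}(0)$
$= \sum_{j=0}^{k_T} m^2(j-1)\, d_{jj}(0) + 2 \sum_{j=0}^{k_T-1} m(j-1)\, m(j)\, d_{j,j+1}(0)$
$= m^2(-1)\, d_{00}(0) + m^2(0)\, d_{11}(0) + d_{22}(0) \sum_{j=2}^{k_T} m^2(j-1) + 2\, m(-1)\, m(0)\, d_{01}(0) + 2\, d_{12}(0) \sum_{j=1}^{k_T-1} m(j-1)\, m(j)$
$= \frac{\alpha^2}{(1+\alpha^2)^2}\, 2 (1+\alpha^2)^2 + (1 + 3\alpha^2 + \alpha^4) + (1+\alpha^2)^2 \sum_{j=2}^{k_T} m^2(j-1) - 2\, \frac{\alpha}{1+\alpha^2}\, 2\alpha (1+\alpha^2) + 2\alpha (1+\alpha^2) \sum_{j=1}^{k_T-1} m(j-1)\, m(j)$,

which converges, as $T \to \infty$, to

(7.48)
$1 + \alpha^2 + \alpha^4 + (1+\alpha^2)^2 \sum_{j=2}^{\infty} m^2(j-1) + 2\alpha (1+\alpha^2) \sum_{j=1}^{\infty} m(j-1)\, m(j)$
$= 1 + \alpha^2 + \alpha^4 - (1+\alpha^2)^2\, m^2(0) + (1+\alpha^2)^2 \sum_{j=1}^{\infty} m^2(j-1) + 2\alpha (1+\alpha^2) \sum_{j=1}^{\infty} m(j-1)\, m(j)$
$= -\alpha^2 + (1+\alpha^2)^2 \sum_{j=1}^{\infty} m^2(j-1) + 2\alpha (1+\alpha^2) \sum_{j=1}^{\infty} m(j-1)\, m(j)$.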
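Similarly,

(7.49)
$R_2 = (1+\alpha^2)^2\, \mathcal{E}\, W_{1T} W_{2T} = \sum_{j=0}^{k_T} \sum_{j'=0}^{k_T} m(j-1)\, m(j'-1)\, d_{jj'}(1)$
$= \sum_{j=0}^{k_T} m^2(j-1)\, d_{jj}(1) + \sum_{j=0}^{k_T-1} m(j-1)\, m(j)\, d_{j,j+1}(1) + \sum_{j=0}^{k_T-2} m(j-1)\, m(j+1)\, d_{j,j+2}(1)$
$= m^2(-1)\, d_{00}(1) + d_{11}(1) \sum_{j=1}^{k_T} m^2(j-1) + m(-1)\, m(0)\, d_{01}(1) + d_{12}(1) \sum_{j=1}^{k_T-1} m(j-1)\, m(j) + m(-1)\, m(1)\, d_{02}(1) + d_{13}(1) \sum_{j=1}^{k_T-2} m(j-1)\, m(j+1)$
$= \frac{\alpha^2}{(1+\alpha^2)^2}\, 2\alpha^2 + \alpha^2 \sum_{j=1}^{k_T} m^2(j-1) - \frac{\alpha}{1+\alpha^2}\, 2\alpha (1+\alpha^2) + \alpha (1+\alpha^2) \sum_{j=1}^{k_T-1} m(j-1)\, m(j) + \frac{\alpha}{1+\alpha^2}\, \alpha \left(1 + \frac{1-\alpha^2}{1+\alpha^2}\right) 2\alpha^2 + \alpha^2 \sum_{j=1}^{k_T-2} m(j-1)\, m(j+1)$
$= \frac{2\alpha^4}{(1+\alpha^2)^2} - 2\alpha^2 + \frac{4\alpha^4}{(1+\alpha^2)^2} + \alpha^2 \sum_{j=1}^{k_T} m^2(j-1) + \alpha (1+\alpha^2) \sum_{j=1}^{k_T-1} m(j-1)\, m(j) + \alpha^2 \sum_{j=1}^{k_T-2} m(j-1)\, m(j+1)$,

which converges as $T \to \infty$ to

(7.50)  $\frac{6\alpha^4}{(1+\alpha^2)^2} - 2\alpha^2 + \alpha^2 \sum_{j=1}^{\infty} m^2(j-1) + \alpha (1+\alpha^2) \sum_{j=1}^{\infty} m(j-1)\, m(j) + \alpha^2 \sum_{j=1}^{\infty} m(j-1)\, m(j+1)$.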
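Hence $R_1 + 2R_2$ converges as $T \to \infty$ to

(7.51)
$-5\alpha^2 + \frac{12\alpha^4}{(1+\alpha^2)^2} + \sum_{j=1}^{\infty} m^2(j-1)\, [(1+\alpha^2)^2 + 2\alpha^2] + \sum_{j=1}^{\infty} m(j-1)\, m(j)\, [4\alpha (1+\alpha^2)] + \sum_{j=1}^{\infty} m(j-1)\, m(j+1)\, [2\alpha^2]$
$= -5\alpha^2 + \frac{12\alpha^4}{(1+\alpha^2)^2} + (1 + 4\alpha^2 + \alpha^4) \sum_{j=0}^{\infty} m^2(j) + 4\alpha (1+\alpha^2) \sum_{j=0}^{\infty} m(j)\, m(j+1) + 2\alpha^2 \sum_{j=0}^{\infty} m(j)\, m(j+2)$.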
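Next we evaluate the following:

(7.52)
$\sum_{j=0}^{\infty} m^2(j) = \sum_{j=0}^{\infty} \alpha^{2j} \left(1 + j\, \frac{1-\alpha^2}{1+\alpha^2}\right)^2 = \sum_{j=0}^{\infty} \alpha^{2j} + 2\, \frac{1-\alpha^2}{1+\alpha^2} \sum_{j=1}^{\infty} j\, \alpha^{2j} + \left(\frac{1-\alpha^2}{1+\alpha^2}\right)^2 \sum_{j=1}^{\infty} j^2 \alpha^{2j}$,

$\sum_{j=0}^{\infty} m(j)\, m(j+1) = \sum_{j=0}^{\infty} (-\alpha)^{2j+1} \left(1 + j\, \frac{1-\alpha^2}{1+\alpha^2}\right) \left(1 + (j+1)\, \frac{1-\alpha^2}{1+\alpha^2}\right) = -\alpha \sum_{j=0}^{\infty} m^2(j) - \alpha\, \frac{1-\alpha^2}{1+\alpha^2} \sum_{j=0}^{\infty} \alpha^{2j} \left(1 + j\, \frac{1-\alpha^2}{1+\alpha^2}\right)$,

$\sum_{j=0}^{\infty} m(j)\, m(j+2) = \sum_{j=0}^{\infty} \alpha^{2j+2} \left(1 + j\, \frac{1-\alpha^2}{1+\alpha^2}\right) \left(1 + (j+2)\, \frac{1-\alpha^2}{1+\alpha^2}\right) = \alpha^2 \sum_{j=0}^{\infty} m^2(j) + 2\alpha^2\, \frac{1-\alpha^2}{1+\alpha^2} \sum_{j=0}^{\infty} \alpha^{2j} \left(1 + j\, \frac{1-\alpha^2}{1+\alpha^2}\right)$.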
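Using these values the last line of (7.51) becomes:

(7.53)
$-5\alpha^2 + \frac{12\alpha^4}{(1+\alpha^2)^2} + (1 + 4\alpha^2 + \alpha^4) \sum_{j=0}^{\infty} m^2(j) + 4\alpha (1+\alpha^2) \left[-\alpha \sum_{j=0}^{\infty} m^2(j) - \alpha\, \frac{1-\alpha^2}{1+\alpha^2} \sum_{j=0}^{\infty} \alpha^{2j} \left(1 + j\, \frac{1-\alpha^2}{1+\alpha^2}\right)\right] + 2\alpha^2 \left[\alpha^2 \sum_{j=0}^{\infty} m^2(j) + 2\alpha^2\, \frac{1-\alpha^2}{1+\alpha^2} \sum_{j=0}^{\infty} \alpha^{2j} \left(1 + j\, \frac{1-\alpha^2}{1+\alpha^2}\right)\right]$

$= -5\alpha^2 + \frac{12\alpha^4}{(1+\alpha^2)^2} + \sum_{j=0}^{\infty} m^2(j)\, [1 + 4\alpha^2 + \alpha^4 - 4\alpha^2 (1+\alpha^2) + 2\alpha^4] + \sum_{j=0}^{\infty} \alpha^{2j} \left(1 + j\, \frac{1-\alpha^2}{1+\alpha^2}\right) \left[-4\alpha^2 (1-\alpha^2) + 4\alpha^4\, \frac{1-\alpha^2}{1+\alpha^2}\right]$

$= -5\alpha^2 + \frac{12\alpha^4}{(1+\alpha^2)^2} + (1-\alpha^2)(1+\alpha^2) \left[\sum_{j=0}^{\infty} \alpha^{2j} + 2\, \frac{1-\alpha^2}{1+\alpha^2} \sum_{j=1}^{\infty} j\, \alpha^{2j} + \left(\frac{1-\alpha^2}{1+\alpha^2}\right)^2 \sum_{j=1}^{\infty} j^2 \alpha^{2j}\right] - \frac{4\alpha^2 (1-\alpha^2)}{1+\alpha^2} \left[\sum_{j=0}^{\infty} \alpha^{2j} + \frac{1-\alpha^2}{1+\alpha^2} \sum_{j=1}^{\infty} j\, \alpha^{2j}\right]$

$= -5\alpha^2 + \frac{12\alpha^4}{(1+\alpha^2)^2} + \frac{(1-\alpha^2)^3}{1+\alpha^2} \sum_{j=0}^{\infty} \alpha^{2j} + \frac{2 (1-\alpha^2)^2 (1+\alpha^4)}{(1+\alpha^2)^2} \sum_{j=1}^{\infty} j\, \alpha^{2j} + \frac{(1-\alpha^2)^3}{1+\alpha^2} \sum_{j=1}^{\infty} j^2 \alpha^{2j}$

$= -5\alpha^2 + \frac{12\alpha^4}{(1+\alpha^2)^2} + \frac{(1-\alpha^2)^2}{1+\alpha^2} + \frac{2\alpha^2 (1+\alpha^4)}{(1+\alpha^2)^2} + \alpha^2$

$= \frac{(-5\alpha^2 + \alpha^2)(1+\alpha^2)^2 + (1-\alpha^2)^2 (1+\alpha^2) + 12\alpha^4 + 2\alpha^2 (1+\alpha^4)}{(1+\alpha^2)^2}$

$= \frac{1 - 3\alpha^2 + 3\alpha^4 - \alpha^6}{(1+\alpha^2)^2} = \frac{(1-\alpha^2)^3}{(1+\alpha^2)^2}$.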
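The algebra of this section can be checked numerically by summing the series in the last line of (7.51) directly and comparing with a closed form. The sketch below assumes the weights $m(j) = (-\alpha)^j [1 + j(1-\alpha^2)/(1+\alpha^2)]$ from (7.25) and takes the closed form to be $(1-\alpha^2)^3/(1+\alpha^2)^2$, which is what the series sum to under this reading of the final display; the truncation point `terms` and the test values of $\alpha$ are arbitrary choices.

```python
def m(j, alpha):
    """Weights of (7.25): m(-1) and m(j), j >= 0."""
    c = (1 - alpha ** 2) / (1 + alpha ** 2)
    if j == -1:
        return -alpha / (1 + alpha ** 2)
    return (-alpha) ** j * (1 + j * c)

def limit_R1_plus_2R2(alpha, terms=400):
    """Truncated evaluation of the last line of (7.51)."""
    a2 = alpha ** 2
    s0 = sum(m(j, alpha) ** 2 for j in range(terms))
    s1 = sum(m(j, alpha) * m(j + 1, alpha) for j in range(terms))
    s2 = sum(m(j, alpha) * m(j + 2, alpha) for j in range(terms))
    return (-5 * a2 + 12 * a2 ** 2 / (1 + a2) ** 2
            + (1 + 4 * a2 + a2 ** 2) * s0
            + 4 * alpha * (1 + a2) * s1
            + 2 * a2 * s2)

def closed_form(alpha):
    """(1 - alpha^2)^3 / (1 + alpha^2)^2."""
    a2 = alpha ** 2
    return (1 - a2) ** 3 / (1 + a2) ** 2

for alpha in (-0.8, -0.3, 0.2, 0.5, 0.9):
    print(alpha, limit_R1_plus_2R2(alpha), closed_form(alpha))
```

Dividing by $(1+\alpha^2)^2$, as in (7.26), turns this into $(1-\alpha^2)^3/(1+\alpha^2)^4$ for the limiting variance of $Z_T$.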

8. MATHEMATICAL DETAILS CORRESPONDING TO CHAPTER 4.


8.1  Proofs  of  Lemmas  4.2  and  4.3  (Section  4.2). 

Proof  of  Lemma  4.2. 

We need to show that $\operatorname{plim}_{T\to\infty} (1/T) \sum_{t=1}^{T} (z^*_t - \mathcal{E}\, z^*_t) = 0$. Let us write $T = mp + r$, where $p$ and $r$ are integers and $0 \le r < m$. Let also $z_t = z^*_t - \mathcal{E}\, z^*_t$. Then

(8.1)
$\left|\frac{1}{T} \sum_{t=1}^{T} z_t\right| \le \sum_{j=1}^{m} \left|\frac{1}{T} \sum_{s=0}^{p-1} z_{j+sm}\right| + \left|\frac{1}{T} \sum_{t=pm+1}^{T} z_t\right|$
$\le \frac{1}{m} \sum_{j=1}^{m} \left|\frac{1}{p} \sum_{s=0}^{p-1} z_{j+sm}\right| + \frac{1}{T} \sum_{t=pm+1}^{T} |z_t|$

(if $r = 0$ the second term in the right-hand side does not exist). By hypothesis, in the first sum of the last line above, and for the $j$-th subsequence ($j = 1, 2, \ldots, m$), $|\sum_{s=0}^{p-1} z_{j+sm}/p|$ is arbitrarily small if $p$ is sufficiently large; if each of these summands becomes bounded by, say, $\eta_j > 0$, then the whole term is bounded by $\eta = \max_j \eta_j$. In the second sum there are at most $m$ summands; since each subsequence converges by hypothesis, each term $|z_s|$ is arbitrarily small if $s$ is large enough, and eventually $|z_s| < \eta$; then the whole sum will be bounded by $(m/T)\,\eta < \eta$. This completes the proof because $\eta$ is arbitrary when $T$ can be chosen arbitrarily large. Q.E.D.


Proof of Lemma 4.3.

From (1.12) we see that for fixed $i$ and $j$ the random variables $z_t = z_t(i,j) = y_{t-i} y_{t-j}$ have common expectation. Since $y_t$ is normal, it also follows that $\operatorname{Var}(z_t)$ is finite and does not change with $t$. Let us consider $i < j$, because the same argument holds for $i > j$; $z_t$ depends on $\epsilon_{t-j-1}$, $\epsilon_{t-j}$, $\epsilon_{t-i-1}$, and $\epsilon_{t-i}$, while $z_{t+s}$ depends on $\epsilon_{t+s-j-1}$, $\epsilon_{t+s-j}$, $\epsilon_{t+s-i-1}$, and $\epsilon_{t+s-i}$; if $s > j - i + 1$ the two sets of $\epsilon$'s are disjoint, and then $z_t$ and $z_{t+s}$ are uncorrelated. It follows that $\{z_t\}$ is a sequence of finitely correlated random variables, with finite common second-order moments. By Lemma 4.2 the weak law of large numbers holds, and shows that

$$
\operatorname*{plim}_{T\to\infty} \frac{1}{T} \sum_{t=k+1}^{T} z_t = \mathcal{E} z_t .
\tag{8.2}
$$

This result, together with (4.10), completes the proof of the lemma. Q.E.D.
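
The finite correlation of the products $z_t$ can be checked exactly. The sketch below assumes the first-order model $y_t = \epsilon_t + \alpha \epsilon_{t-1}$ with unit error variance and uses the standard fourth-moment formula for zero-mean normal variables, $\operatorname{Cov}(AB, CD) = \mathcal{E}(AC)\mathcal{E}(BD) + \mathcal{E}(AD)\mathcal{E}(BC)$; the function names are ours.

```python
def ma1_autocov(lag, alpha):
    """Autocovariance of y_t = e_t + alpha * e_{t-1} with unit error variance."""
    lag = abs(lag)
    if lag == 0:
        return 1 + alpha * alpha
    return alpha if lag == 1 else 0

def cov_products(i, j, s, alpha):
    """Cov(y_{t-i} y_{t-j}, y_{t+s-i} y_{t+s-j}) for jointly normal y's."""
    g = lambda lag: ma1_autocov(lag, alpha)
    # E(AC) E(BD) uses lag s twice; E(AD) E(BC) uses lags s-(j-i) and s+(j-i).
    return g(s) * g(s) + g(s - (j - i)) * g(s + (j - i))
```

For $i < j$ the covariance vanishes for every $s > j - i + 1$, in agreement with the argument of the proof.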

8.2  Proof  of  Theorem  4.1  (Section  4.2). 

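We have that

$$
\operatorname*{plim}_{T\to\infty} \hat{\alpha}_T
= - \frac{\sum_{i=0}^{k-1} \left( \operatorname{plim}_{T\to\infty} b_{iT} \right) \left( \operatorname{plim}_{T\to\infty} b_{i+1,T} \right)}{\sum_{i=0}^{k-1} \left( \operatorname{plim}_{T\to\infty} b_{iT} \right)^2} ,
\tag{8.3}
$$

since all relevant plim's exist. The numerator of (8.3) is evaluated as follows:

$$
\begin{aligned}
\sum_{i=0}^{k-1} & (-\alpha)^{2i+1} (1-\alpha^{2k+2})^{-2} (1-\alpha^{2k+2-2i})(1-\alpha^{2k+2-2i-2}) \\
&= -\alpha (1-\alpha^{2k+2})^{-2} \sum_{i=0}^{k-1} \left( \alpha^{2i} - \alpha^{2k+2} - \alpha^{2k} + \alpha^{4k+2-2i} \right) \\
&= -\alpha (1-\alpha^{2k+2})^{-2} \left[ \frac{1-\alpha^{2k}}{1-\alpha^2} - k\alpha^{2k}(1+\alpha^2) + \alpha^{2k+4} \frac{1-\alpha^{2k}}{1-\alpha^2} \right] \\
&= -\alpha (1-\alpha^{2k+2})^{-2} (1-\alpha^2)^{-1} \left[ (1-\alpha^{2k})(1+\alpha^{2k+4}) - k\alpha^{2k}(1+\alpha^2)(1-\alpha^2) \right] .
\end{aligned}
\tag{8.4}
$$

The denominator of (8.3) is equal to

$$
(1-\alpha^{2k+2})^{-2} \sum_{i=0}^{k-1} \alpha^{2i} (1-\alpha^{2k+2-2i})^2
= (1-\alpha^{2k+2})^{-2} (1-\alpha^2)^{-1} \left[ (1-\alpha^{2k})(1+\alpha^{2k+6}) - 2k\alpha^{2k+2}(1-\alpha^2) \right] .
\tag{8.5}
$$

The first line of (4.9) follows immediately and the second line is an algebraic rearrangement of terms. Q.E.D.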
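
The two closed forms can be checked against direct summation with exact rational arithmetic. This is a sketch under the assumption that the probability limits of the $b_{iT}$ are the $\beta_i^*$ of (8.7) (with $\beta_0^* = 1$) and that both sums in (8.3) run over $i = 0, \ldots, k-1$; the function names are ours.

```python
from fractions import Fraction

def beta_star(i, k, a):
    """(-a)^i (1 - a^(2k+2-2i)) / (1 - a^(2k+2)), the limit of b_iT."""
    return (-a) ** i * (1 - a ** (2 * k + 2 - 2 * i)) / (1 - a ** (2 * k + 2))

def sums_direct(k, a):
    """Sum of beta_i beta_{i+1} and sum of beta_i^2, both over i = 0..k-1."""
    num = sum(beta_star(i, k, a) * beta_star(i + 1, k, a) for i in range(k))
    den = sum(beta_star(i, k, a) ** 2 for i in range(k))
    return num, den

def sums_closed(k, a):
    """Last lines of (8.4) and (8.5)."""
    c = (1 - a ** (2 * k + 2)) ** 2 * (1 - a * a)
    num = -a * ((1 - a ** (2 * k)) * (1 + a ** (2 * k + 4))
                - k * a ** (2 * k) * (1 - a ** 4)) / c
    den = ((1 - a ** (2 * k)) * (1 + a ** (2 * k + 6))
           - 2 * k * a ** (2 * k + 2) * (1 - a * a)) / c
    return num, den
```

With these sums the probability limit in (8.3) is `-num / den`, which approaches $\alpha$ as $k$ grows.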

8.3  Proof  of  Corollary  4.5  (Section  4.2). 

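The right-hand side of (4.9) is (by long division and appropriate collection of terms)

$$
\begin{aligned}
\alpha &+ \alpha^{2k+1}(1-\alpha^2)\left[ \alpha^4 - k(1-\alpha^2) \right] \\
&+ \frac{\alpha^{4k+1}(1-\alpha^2)\left[ -2k^2\alpha^2(1-\alpha^2)^2 \right]}{(1-\alpha^{2k})(1+\alpha^{2k+6}) - 2k\alpha^{2k+2}(1-\alpha^2)} \\
&+ \frac{\alpha^{4k+1}(1-\alpha^2)\left\{ -\alpha^{10} + k(1-\alpha^2)(3\alpha^6 - 1) + \alpha^{2k}\left[ \alpha^{10} - k(1-\alpha^2)\alpha^6 \right] \right\}}{(1-\alpha^{2k})(1+\alpha^{2k+6}) - 2k\alpha^{2k+2}(1-\alpha^2)} .
\end{aligned}
\tag{8.6}
$$

The denominator of each fraction approaches 1 as $k \to \infty$. Q.E.D.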

8.4 Proof of Theorem 4.6 (Section 4.3).

8.4.1 Part 1 [Asymptotic normality of $\sqrt{T}(b_T - \beta^*)$].

In the notation of Section 4.2, $\beta^*$ has components

$$
\beta_j^* = (-\alpha)^j \, \frac{1-\alpha^{2k+2-2j}}{1-\alpha^{2k+2}} , \qquad j = 1, 2, \ldots, k ;
\tag{8.7}
$$

in fact we will want to extend the range of (8.7) to include $j = 0$ ($\beta_0^* = 1$) and $k+1$ ($\beta_{k+1}^* = 0$). Since $\sigma^2 = 1$ we now have that


(8.8) 


where  e=  ( 1,0,  . . .,0)’ , and 


(8.9)  = P = Z , 

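so that $b_T = -M_T^{-1} m_T$ and $\beta^* = -\Sigma^{-1}\sigma$. Then

$$
\begin{aligned}
\sqrt{T}(b_T - \beta^*) &= -\sqrt{T}\left( M_T^{-1} m_T - \Sigma^{-1}\sigma \right) \\
&= -\sqrt{T}\left\{ \left[ \Sigma + (M_T - \Sigma) \right]^{-1}\left[ \sigma + (m_T - \sigma) \right] - \Sigma^{-1}\sigma \right\} \\
&= -\sqrt{T}\left\{ \Sigma^{-1}\left[ I + (M_T - \Sigma)\Sigma^{-1} \right]^{-1}\left[ \sigma + (m_T - \sigma) \right] - \Sigma^{-1}\sigma \right\} .
\end{aligned}
\tag{8.10}
$$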

It is easily checked that if $I + A$ is nonsingular

$$
(I + A)^{-1} = I - A + (I + A)^{-1} A^2 .
\tag{8.11}
$$
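
The identity (8.11) follows on multiplying both sides by $I + A$; it can also be confirmed on an example. The sketch below uses $2 \times 2$ matrices with rational entries; the helper names are ours.

```python
from fractions import Fraction

def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][m] * y[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_2x2(m):
    """Exact inverse of a nonsingular 2 x 2 matrix."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def expansion_rhs(a):
    """I - A + (I + A)^{-1} A^2 for a 2 x 2 matrix A."""
    eye = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
    i_plus_a = [[eye[i][j] + a[i][j] for j in range(2)] for i in range(2)]
    tail = matmul(inverse_2x2(i_plus_a), matmul(a, a))
    return [[eye[i][j] - a[i][j] + tail[i][j] for j in range(2)]
            for i in range(2)]
```

The point of the identity in the proof is that the last term is of second order in $A$, so it is negligible after multiplication by $\sqrt{T}$.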


For $A = (M_T - \Sigma)\Sigma^{-1}$, $I + A = M_T \Sigma^{-1}$ is nonsingular with probability one because $M_T$ has this property (see Section 4.1) and $\Sigma$ is also nonsingular (in fact $\Sigma$ of any order is nonsingular for any value of $\alpha$, while the condition $|\alpha| < 1$ makes $\Sigma$ of any order positive definite).

We deduce that $\operatorname{plim}_{T\to\infty} (M_T - \Sigma) = 0$ (Lemma 4.3), $\operatorname{plim}_{T\to\infty} A = 0$, $\operatorname{plim}_{T\to\infty} (I + A)^{-1} = I$, and that $\sqrt{T} A$ has asymptotically normal components [see e.g. Anderson (1971a), Section 8.4.2]. Hence $\operatorname{plim}_{T\to\infty} \sqrt{T}\,(I + A)^{-1} A^2 = 0$, and (8.10) has the same limiting distribution as


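$$
\begin{aligned}
-\sqrt{T} & \left\{ \Sigma^{-1}\left[ I - (M_T - \Sigma)\Sigma^{-1} \right]\left[ \sigma + (m_T - \sigma) \right] - \Sigma^{-1}\sigma \right\} \\
&= -\sqrt{T}\,\Sigma^{-1}\left\{ (m_T - \sigma) - (M_T - \Sigma)\Sigma^{-1}\sigma - (M_T - \Sigma)\Sigma^{-1}(m_T - \sigma) \right\} .
\end{aligned}
\tag{8.12}
$$

Since $\sqrt{T}(M_T - \Sigma)$ has asymptotically normal components, and $\operatorname{plim}_{T\to\infty} (m_T - \sigma) = 0$, the third summand inside the brackets in (8.12) is asymptotically negligible, and (8.12) has the same limiting distribution as

$$
-\sqrt{T}\,\Sigma^{-1}\left[ (m_T - \sigma) + (M_T - \Sigma)\beta^* \right] = -\sqrt{T}\,\Sigma^{-1}\left( m_T + M_T \beta^* \right) .
\tag{8.13}
$$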
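8.4.2 Part 2 [Asymptotic covariance matrix of $\sqrt{T}(m_T + M_T \beta^*)$].

We now evaluate (4.26). Using (8.7) and (1.7) we have that

$$
\begin{aligned}
\sum_{h=0}^{k} \beta_h^* y_{t-h}
&= \sum_{h=0}^{k} (-\alpha)^h \, \frac{1-\alpha^{2k+2-2h}}{1-\alpha^{2k+2}} \, y_{t-h} \\
&= \frac{1}{1-\alpha^{2k+2}} \left\{ \sum_{h=0}^{k} (-\alpha)^h \left[ \epsilon_{t-h} - (-\alpha)\epsilon_{t-h-1} \right] - (-\alpha)^{k+2} \sum_{h=0}^{k} (-\alpha)^{k-h} \left[ \epsilon_{t-h} - (-\alpha)\epsilon_{t-h-1} \right] \right\} \\
&= \frac{1}{1-\alpha^{2k+2}} \left\{ \epsilon_t - (-\alpha)^{k+1}\epsilon_{t-(k+1)} - (-\alpha)^{k+2} \left[ (-\alpha)^k \epsilon_t - (-\alpha)\epsilon_{t-(k+1)} + (1-\alpha^2) \sum_{h=1}^{k} (-\alpha)^{k-h} \epsilon_{t-h} \right] \right\} \\
&= \epsilon_t - \frac{1-\alpha^2}{1-\alpha^{2k+2}} \, (-\alpha)^{k+2} \sum_{h=1}^{k+1} (-\alpha)^{k-h} \epsilon_{t-h} \\
&= \sum_{h=0}^{k+1} \gamma_h \epsilon_{t-h} ,
\end{aligned}
\tag{8.14}
$$

say, where

$$
\gamma_0 = 1 , \qquad \gamma_h = - \frac{(1-\alpha^2)(-\alpha)^{k+2}(-\alpha)^{k-h}}{1-\alpha^{2k+2}} , \qquad h = 1, 2, \ldots, k+1 .
\tag{8.15}
$$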
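
The coefficients $\gamma_h$ can be checked by collecting, for each lag, the coefficient of $\epsilon_{t-h}$ in $\sum_h \beta_h^* y_{t-h}$. The following sketch assumes the first-order model $y_t = \epsilon_t + \alpha\epsilon_{t-1}$ and the $\beta_h^*$ of (8.7); the function names are ours, and the sign of $\gamma_h$ for $h \ge 1$ is the one obtained by this matching of coefficients.

```python
from fractions import Fraction

def gamma_closed(h, k, a):
    """gamma_0 = 1 and, for h = 1..k+1, the closed form of (8.15)."""
    if h == 0:
        return Fraction(1)
    return (-(1 - a * a) * (-a) ** (k + 2) * (-a) ** (k - h)
            / (1 - a ** (2 * k + 2)))

def gamma_direct(k, a):
    """Coefficients of eps_t, ..., eps_{t-k-1} in sum_h beta*_h y_{t-h}."""
    beta = [(-a) ** h * (1 - a ** (2 * k + 2 - 2 * h)) / (1 - a ** (2 * k + 2))
            for h in range(k + 1)]
    coef = [Fraction(0)] * (k + 2)
    for h, b in enumerate(beta):
        coef[h] += b          # contribution of eps_{t-h}
        coef[h + 1] += a * b  # contribution of alpha * eps_{t-h-1}
    return coef
```

For example, with $k = 1$ the coefficients are $1$, $\alpha^3/(1+\alpha^2)$ and $-\alpha^2/(1+\alpha^2)$.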


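Hence (4.26) reduces to

$$
\lim_{T\to\infty} \frac{1}{T} \sum_{s,t=k+1}^{T} \; \sum_{h,h'=0}^{k+1} \gamma_h \gamma_{h'} \, \mathcal{E}\left( y_{t-i}\,\epsilon_{t-h}\, y_{s-j}\,\epsilon_{s-h'} \right) , \qquad 1 \le i, j \le k .
\tag{8.16}
$$

We have to evaluate the expectation, namely

$$
\begin{aligned}
\mathcal{E}\left( y_{t-i}\,\epsilon_{t-h}\, y_{s-j}\,\epsilon_{s-h'} \right)
&= \mathcal{E}\,(\epsilon_{t-i} + \alpha\epsilon_{t-i-1})(\epsilon_{s-j} + \alpha\epsilon_{s-j-1})\,\epsilon_{t-h}\,\epsilon_{s-h'} \\
&= \mathcal{E}\left( \epsilon_{t-i}\epsilon_{t-h}\epsilon_{s-j}\epsilon_{s-h'} + \alpha\,\epsilon_{t-i}\epsilon_{t-h}\epsilon_{s-j-1}\epsilon_{s-h'} \right. \\
&\qquad \left. {} + \alpha\,\epsilon_{t-i-1}\epsilon_{t-h}\epsilon_{s-j}\epsilon_{s-h'} + \alpha^2\,\epsilon_{t-i-1}\epsilon_{t-h}\epsilon_{s-j-1}\epsilon_{s-h'} \right) .
\end{aligned}
\tag{8.17}
$$

Let $\{\sigma_\epsilon(s)\}$ denote the covariance sequence of the $\epsilon_t$'s, so that $\sigma_\epsilon(s) = \sigma^2$ for $s = 0$, and equal to 0 for $s \ne 0$. Since by hypothesis the $\epsilon_t$'s are normal, we have that [see for example Anderson (1971a), Section 8.2]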
$$
\begin{aligned}
\sigma_\epsilon(i-h)\,\sigma_\epsilon(h'-j) &= \sigma^4 , && i = h \text{ and } j = h', \text{ for every } s \text{ and } t, \\
\sigma_\epsilon(s-t+i-j)\,\sigma_\epsilon(t-s+h'-h) &= \sigma^4 , && h' = h + (j-i), \text{ for } s-t = j-i = h'-h, \\
\sigma_\epsilon(s-t+i-h')\,\sigma_\epsilon(t-s+j-h) &= \sigma^4 , && h' = j+i-h, \text{ for } s-t = h'-i = j-h,
\end{aligned}
\tag{8.19}
$$

and all other possibilities vanish. Proceeding in a similar way with the other three terms of (8.17), we conclude that

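$$
\begin{aligned}
\mathcal{E}\left( y_{t-i}\,\epsilon_{t-h}\, y_{s-j}\,\epsilon_{s-h'} \right)
&= \sigma^4 , && i = h \text{ and } j = h', \text{ for every } s \text{ and } t, \\
&= \alpha\sigma^4 , && i = h \text{ and } j+1 = h', \text{ for every } s \text{ and } t, \\
& && \text{or } i+1 = h \text{ and } j = h', \text{ for every } s \text{ and } t, \\
&= \alpha^2\sigma^4 , && i+1 = h \text{ and } j+1 = h', \text{ for every } s \text{ and } t, \\
&= \sigma^4 , && h' = h + (j-i) \text{ for } s-t = j-i = h'-h, \\
& && \text{or } h' = j+i-h \text{ for } s-t = h'-i = j-h, \\
&= \alpha\sigma^4 , && h' = h+1+(j-i) \text{ for } s-t = j-i+1 = h'-h, \\
& && \text{or } h' = i+j+1-h \text{ for } s-t = h'-i = j+1-h, \\
& && \text{or } h' = h-1+(j-i) \text{ for } s-t = j-i-1 = h'-h, \\
& && \text{or } h' = i+j+1-h \text{ for } s-t = h'-i-1 = j-h, \\
&= \alpha^2\sigma^4 , && h' = h + (j-i) \text{ for } s-t = j-i = h'-h, \\
& && \text{or } h' = i+j+2-h \text{ for } s-t = h'-i-1 = j-h+1, \\
&= 0 , && \text{otherwise,}
\end{aligned}
\tag{8.20}
$$

where the values listed are to be added when several of the conditions hold simultaneously.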
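
The case analysis above can be reproduced mechanically. The sketch below takes $\sigma^2 = 1$ and the first-order model $y_t = \epsilon_t + \alpha\epsilon_{t-1}$; it uses the fourth-moment formula for independent standard normal errors, $\mathcal{E}(\epsilon_a\epsilon_b\epsilon_c\epsilon_d) = \delta_{ab}\delta_{cd} + \delta_{ac}\delta_{bd} + \delta_{ad}\delta_{bc}$. The function names and argument order are ours.

```python
def e4_white(a, b, c, d):
    """E[e_a e_b e_c e_d] for independent N(0, 1) errors."""
    eq = lambda x, y: 1 if x == y else 0
    return eq(a, b) * eq(c, d) + eq(a, c) * eq(b, d) + eq(a, d) * eq(b, c)

def e_cross(t, s, i, j, h, hp, alpha):
    """E[y_{t-i} e_{t-h} y_{s-j} e_{s-h'}] for y_t = e_t + alpha * e_{t-1}."""
    total = 0
    # Expand each y into its two error terms and add the four expectations.
    for u, cu in ((t - i, 1), (t - i - 1, alpha)):
        for v, cv in ((s - j, 1), (s - j - 1, alpha)):
            total += cu * cv * e4_white(u, t - h, v, s - hp)
    return total
```

For instance, with $s - t = j - i = h' - h$ and no other coincidence the expectation is $1 + \alpha^2$, the sum of the fourth and sixth values in the list.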


Note that in the last three equalities, $t$ and $s$ are restricted by conditions such as $t - s = i - j$, $t - s = h - h'$, or the like; hence there are fewer than $2T(k+2)$ nonzero contributions, and as $T \to \infty$ their total contribution to (8.16) remains bounded. That is not the case for the first three equalities though. We analyse these first. Let us take $\sigma^2 = 1$ again.

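The contribution of the first three lines of (8.20) is $T$ times

$$
\gamma_i\gamma_j + \alpha\left( \gamma_i\gamma_{j+1} + \gamma_{i+1}\gamma_j \right) + \alpha^2\gamma_{i+1}\gamma_{j+1}
= \left[ \frac{(1-\alpha^2)\,\alpha^{2k+2}}{1-\alpha^{2k+2}} \right]^2 \left\{ (-\alpha)^{-(i+j)} + 2\alpha(-\alpha)^{-(i+j+1)} + \alpha^2(-\alpha)^{-(i+j+2)} \right\} = 0 .
\tag{8.21}
$$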


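For fixed $i$ and $j \ge i$, there are $T - (k+1) + 1 - (j-i)$ cases where $t - s = j - i$, and similar numbers when $t - s = h - i$, etc. Hence as $T \to \infty$ such numbers divided by $T$ tend to 1, and hence for $j \ge i$ (8.16) is equal to

$$
\begin{aligned}
(1+\alpha^2) & \sum_{h=0}^{k+1-(j-i)} \gamma_h\gamma_{h+(j-i)}
+ \alpha \sum_{h=\max\{0,\,1-(j-i)\}}^{\min\{k-(j-i)+2,\,k+1\}} \gamma_h\gamma_{h+(j-i)-1}
+ \alpha \sum_{h} \gamma_h\gamma_{h+(j-i)+1} \\
&+ \sum_{h} \gamma_h\gamma_{i+j-h} + 2\alpha \sum_{h} \gamma_h\gamma_{i+j+1-h} + \alpha^2 \sum_{h} \gamma_h\gamma_{i+j+2-h} ,
\end{aligned}
\tag{8.22}
$$

where a sum without explicit limits extends over the values of $h$ for which both subscripts lie between 0 and $k+1$.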


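The sums in (8.22) are evaluated as follows:

$$
\begin{aligned}
\sum_{h=0}^{k+1-(j-i)} \gamma_h\gamma_{h+(j-i)}
&= \gamma_0\gamma_{j-i} + \sum_{h=1}^{k+1-(j-i)} \left[ \frac{(1-\alpha^2)\,\alpha^{2k+2}}{1-\alpha^{2k+2}} \right]^2 (-\alpha)^{-2h-(j-i)} \\
&= \gamma_{j-i} + \frac{(1-\alpha^2)}{(1-\alpha^{2k+2})^2} \, (-\alpha)^{2k+2+(j-i)} \left( 1-\alpha^{2k+2-2(j-i)} \right) , \qquad j \ge i ,
\end{aligned}
\tag{8.23}
$$

$$
\sum_{h=0}^{k+2-(j-i)} \gamma_h\gamma_{h+(j-i)-1}
= \gamma_{j-i-1} + \frac{(1-\alpha^2)}{(1-\alpha^{2k+2})^2} \, (-\alpha)^{2k+1+(j-i)} \left( 1-\alpha^{2k+4-2(j-i)} \right) , \qquad j > i ,
\tag{8.24}
$$

$$
\sum_{h=1}^{k+1} \gamma_h\gamma_{h-1}
= \gamma_1 + \frac{(1-\alpha^2)}{(1-\alpha^{2k+2})^2} \, (-\alpha)^{2k+3} \left( 1-\alpha^{2k} \right) , \qquad j = i ,
\tag{8.25}
$$

$$
\sum_{h=0}^{k-(j-i)} \gamma_h\gamma_{h+(j-i)+1}
= \gamma_{j-i+1} + \frac{(1-\alpha^2)}{(1-\alpha^{2k+2})^2} \, (-\alpha)^{2k+3+(j-i)} \left( 1-\alpha^{2k-2(j-i)} \right) , \qquad j \ge i .
\tag{8.26}
$$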

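In the fourth sum in (8.22) we have that $0 \le i+j-h \le k+1$ if and only if $i+j-(k+1) \le h \le i+j$, so that the sum is

$$
\sum_{h=\max\{0,\,i+j-(k+1)\}}^{\min\{i+j,\,k+1\}} \gamma_h\gamma_{i+j-h}
=
\begin{cases}
\displaystyle \sum_{h=0}^{i+j} \gamma_h\gamma_{i+j-h} , & i+j \le k+1 , \\[2ex]
\displaystyle \sum_{h=i+j-(k+1)}^{k+1} \gamma_h\gamma_{i+j-h} , & i+j > k+1 .
\end{cases}
\tag{8.27}
$$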


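Using the same type of argument we are led to evaluate the following sums:

$$
\sum_{h=0}^{i+j} \gamma_h\gamma_{i+j-h} = 2\gamma_{i+j} + (i+j-1)\,(-\alpha)^{4k+4-(i+j)} \, \frac{(1-\alpha^2)^2}{(1-\alpha^{2k+2})^2} , \qquad i+j \le k+1 ,
\tag{8.28}
$$

$$
\sum_{h=i+j-(k+1)}^{k+1} \gamma_h\gamma_{i+j-h} = \left[ (2k+3) - (i+j) \right] (-\alpha)^{4k+4-(i+j)} \, \frac{(1-\alpha^2)^2}{(1-\alpha^{2k+2})^2} , \qquad i+j > k+1 ,
\tag{8.29}
$$

$$
\sum_{h} \gamma_h\gamma_{i+j+1-h} =
\begin{cases}
\displaystyle 2\gamma_{i+j+1} + (i+j)\,(-\alpha)^{4k+3-(i+j)} \, \frac{(1-\alpha^2)^2}{(1-\alpha^{2k+2})^2} , & i+j \le k , \\[2ex]
\displaystyle \left[ (2k+2) - (i+j) \right] (-\alpha)^{4k+3-(i+j)} \, \frac{(1-\alpha^2)^2}{(1-\alpha^{2k+2})^2} , & i+j > k ,
\end{cases}
\tag{8.30}
$$

$$
\sum_{h} \gamma_h\gamma_{i+j+2-h} =
\begin{cases}
\displaystyle 2\gamma_{i+j+2} + (i+j+1)\,(-\alpha)^{4k+2-(i+j)} \, \frac{(1-\alpha^2)^2}{(1-\alpha^{2k+2})^2} , & i+j \le k-1 , \\[2ex]
\displaystyle \left[ (2k+1) - (i+j) \right] (-\alpha)^{4k+2-(i+j)} \, \frac{(1-\alpha^2)^2}{(1-\alpha^{2k+2})^2} , & i+j > k-1 .
\end{cases}
\tag{8.31}
$$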

Note that as in (8.21),

$$
\gamma_{i+j} + 2\alpha\gamma_{i+j+1} + \alpha^2\gamma_{i+j+2} = 0 .
\tag{8.32}
$$

With this background we now find $f_{ij1}$ and $f_{ij2}$ to use in (4.27).
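
The sums of products of the $\gamma_h$ used above lend themselves to an exact numerical check. The sketch below assumes $\gamma_0 = 1$ and $\gamma_h = -(1-\alpha^2)(-\alpha)^{2k+2-h}/(1-\alpha^{2k+2})$ for $h = 1, \ldots, k+1$, as obtained by matching coefficients in (8.14), and sets $\gamma_h = 0$ outside $0, \ldots, k+1$; the function names are ours.

```python
from fractions import Fraction

def gamma(h, k, a):
    """gamma_h, taken to be zero outside h = 0, ..., k+1."""
    if h == 0:
        return Fraction(1)
    if h < 0 or h > k + 1:
        return Fraction(0)
    return -(1 - a * a) * (-a) ** (2 * k + 2 - h) / (1 - a ** (2 * k + 2))

def lag_sum(lag, k, a):
    """sum_h gamma_h gamma_{h+lag} over all admissible h, lag >= 0."""
    return sum(gamma(h, k, a) * gamma(h + lag, k, a) for h in range(k + 2))

def conv_sum(n, k, a):
    """sum_h gamma_h gamma_{n-h} over all admissible h."""
    return sum(gamma(h, k, a) * gamma(n - h, k, a) for h in range(k + 2))
```

`lag_sum` corresponds to the sums depending on $j - i$ and `conv_sum` to those depending on $i + j$; the vanishing combination of three consecutive $\gamma$'s is what makes most of the latter cancel.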


(8.33) 


(8.34) 


(8.35) 


„2  2 


„2k+2 


fm  ' (lra2){v(^^)5-“)2k+2 


l-a 


1-0! 


2 2 


1 Wk+2' 


Pk 

(Jl)aw  w ,+  a 

l-a 


5n + (-Users’)  ( >2k+3  ^ 


1 ^2k+2' 


1-os 


} 


= CL-w2)^  7i+('  -lira')2  {(i-a2k+2)d^2)+2a(^)(l-«2k 

1 -0(  L 


(l-K72)-fQ!2k+2  1-Q: 


2 f 


l-a1 


2k+2 


2s  l-ta 
K l :J  -a  ) 

l-a1 


2k+2 


2k+2 


i+1,1 


2)71^70^72  = a - ^+2 


l-a 


{ (14a2)  (-a)"14a(-a)’23 


= a - (-a)2k+3 

l-a2k+2 


f . •=  (nar)y  -tay  n+a 7 n + 

i,i+r,l  v t 'r-l  ' r+1 


(l-a2)  (-a)2k+1+r 

(l-a2k+2)2  ^ 


{ (na2)  ;(«a ) (i-a2k+2 “2r ) -ta  (1  -a2k+i| “2r ) +a  (_a  )2  (l  _a2k+2 "2r ) ) 


= 0 , 


r = 2,3, • • - ,k-l 
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(8.36)  f.  ,2  - 27k+ta7k+1  + 


1-0! 


1-0! 


2k+2 


|(k-l ) ( -a  )3k+1++  k2a  ( -a  )3k+3  -ho;2  (k+i ) ( -a  )3k+2} 


l-a£ 


1-a 


,2k+2 


|2+4a(- 


N-l 


f (-a) 


k+2 


2(0!) 


,k+2  l-a" 


i+d=k 


(8-37)  f±-2  = 2rk+1  + (-  1^+g)  |k(^)3k+3+20!(k+l)(oi!)3k+2k):2k(^)3]£+1| 


37(^)k  ,(l.-Q2)a-a2k^)  , 

(l^2k+2) 


i+j=k+l 


By the same type of substitutions it is easily verified that

$$
f_{ij2} = 0 , \qquad i+j < k \ \text{ or } \ i+j > k+1 .
\tag{8.38}
$$

Since $F$ is symmetric, this completes the proof of (4.28) and (4.29).
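8.4.3 Part 3 [Asymptotic variance of $\sqrt{T}(\hat{\alpha}_T - \alpha^*)$].

Using (4.30) we first evaluate the partial derivatives to be used in (4.31).

$$
\frac{\partial \alpha^*}{\partial \beta_j^*}
= - \frac{ (\beta_{j-1}^* + \beta_{j+1}^*) \sum_{i=0}^{k} \beta_i^{*2} - 2\beta_j^* \sum_{i=0}^{k-1} \beta_i^*\beta_{i+1}^* }{ \left( \sum_{i=0}^{k} \beta_i^{*2} \right)^2 } , \qquad j = 1, 2, \ldots, k ,
\tag{8.39}
$$

where from (8.7) $\beta_0^* = 1$ and $\beta_{k+1}^* = 0$.

The sum $\sum_{i=0}^{k-1} \beta_i^*\beta_{i+1}^*$ was evaluated in the proof of Theorem 4.1, and a similar calculation shows that

$$
\sum_{i=0}^{k} \beta_i^{*2} = (1-\alpha^{2k+2})^{-2}(1-\alpha^2)^{-1} \left[ (1-\alpha^{2k+2})(1+\alpha^{2k+4}) - 2(k+1)\alpha^{2k+2}(1-\alpha^2) \right] .
\tag{8.40}
$$

Hence we have that

$$
\begin{aligned}
(1-\alpha^{2k+2}) & \left( \sum_{i=0}^{k} \beta_i^{*2} \right)^2 \frac{\partial \alpha^*}{\partial \beta_j^*} \\
&= - \left\{ \sum_{i=0}^{k} \beta_i^{*2} \left[ (-\alpha)^{j-1}(1-\alpha^{2k+4-2j}) + (-\alpha)^{j+1}(1-\alpha^{2k-2j}) \right] - 2(-\alpha)^j (1-\alpha^{2k+2-2j}) \sum_{i=0}^{k-1} \beta_i^*\beta_{i+1}^* \right\} \\
&= - \frac{1}{(1-\alpha^2)(1-\alpha^{2k+2})^2} \Big\{ (-\alpha)^{j-1}(1+\alpha^2)(1-\alpha^{2k+2-2j}) \left[ (1-\alpha^{2k+2})(1+\alpha^{2k+4}) - 2(k+1)(1-\alpha^2)\alpha^{2k+2} \right] \\
&\qquad\qquad - 2(-\alpha)^{j+1}(1-\alpha^{2k+2-2j}) \left[ (1-\alpha^{2k})(1+\alpha^{2k+4}) - k\alpha^{2k}(1-\alpha^4) \right] \Big\} ,
\end{aligned}
$$

so that

$$
\frac{\partial \alpha^*}{\partial \beta_j^*} = (-\alpha)^{j-1} (1-\alpha^{2k+2-2j}) (1-\alpha^2)^2 \, \lambda_k , \qquad j = 1, 2, \ldots, k ,
\tag{8.41}
$$

where

$$
\lambda_k = (1-\alpha^{2k+2}) \, \frac{ 2\alpha^{2k+2}(1+\alpha^2) - (1+\alpha^{2k+2})(1+\alpha^{2k+4}) }{ \left[ (1-\alpha^{2k+2})(1+\alpha^{2k+4}) - 2(k+1)\alpha^{2k+2}(1-\alpha^2) \right]^2 } .
\tag{8.42}
$$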
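
The closed form for the partial derivatives can be compared with the quotient rule applied directly. This sketch assumes that (4.30) defines $\alpha^* = -\sum_{i=0}^{k-1}\beta_i^*\beta_{i+1}^* / \sum_{i=0}^{k}\beta_i^{*2}$, which is what the displayed derivative implies; the function names are ours.

```python
from fractions import Fraction

def beta_star(j, k, a):
    """(8.7), extended to j = 0 (value 1) and j = k + 1 (value 0)."""
    return (-a) ** j * (1 - a ** (2 * k + 2 - 2 * j)) / (1 - a ** (2 * k + 2))

def d_alpha_direct(j, k, a):
    """Quotient-rule derivative of -S1/S2 with respect to beta_j, 1 <= j <= k."""
    b = [beta_star(i, k, a) for i in range(k + 2)]
    s2 = sum(x * x for x in b[: k + 1])
    s1 = sum(b[i] * b[i + 1] for i in range(k))
    return -((b[j - 1] + b[j + 1]) * s2 - 2 * b[j] * s1) / s2 ** 2

def lambda_k(k, a):
    """The factor lambda_k of (8.42)."""
    c = a ** (2 * k + 2)
    n2 = (1 - c) * (1 + c * a * a) - 2 * (k + 1) * c * (1 - a * a)
    return (1 - c) * (2 * c * (1 + a * a) - (1 + c) * (1 + c * a * a)) / n2 ** 2

def d_alpha_closed(j, k, a):
    """(-a)^(j-1) (1 - a^(2k+2-2j)) (1 - a^2)^2 lambda_k."""
    return ((-a) ** (j - 1) * (1 - a ** (2 * k + 2 - 2 * j))
            * (1 - a * a) ** 2 * lambda_k(k, a))
```

As $k \to \infty$, $\lambda_k \to -1$ and the derivative tends to $-(1-\alpha^2)^2(-\alpha)^{j-1}$.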

With the notation of (4.14) with $\sigma^2 = 1$, the elements of $H$ defined in (4.25) are


h.  . 


I I o-lmf  crnJ 

L-  u mn 


(8.43) 


k k 

I I 

m=l  n=l 
k 


Llll 


V im  mg'  / v im  m+x,g^  i,m+±  mg  v 

2_  cr  cr  +f  (£_  c r cr  ,0+cr  ’ cr  0 ) 


k-1. 


im  m+l^g"  i,m+l  mg' 


m=l 
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m=l 


k-1 


+ fl,k-l,2  I ^ cr 

’ ’ m=l 


im  k-m,  g 


k 


+ flk2  \ criV+1-"’') 


* flnl  I * WE 


‘111  i.  “ “ ■ *121' 

m=l  m=l 


+ f y ^ + f y 

l,k-l,2  A,  lk2  A.  * 

m=l  m=l 


the  latter  because  cr°^  = 0,  crk  = 0^  and  hence  we  can  include  the  k-th 
summand  in  each  sum. 

Substitution  in  (4.31)  gives 


v 


(l^2) 

a2 


k 


| (^  )i+^  (i-a2k+2~2i ) (i-a2k+2-2J  )h 

i^  0=1 
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-~g2)  ,2 

a2  k 


4 


| (^)1+3  h =a2k+2  | (^)-i+jh.  . 

— 1 Xjjj— 1 


(8.44)  -a 


2k+2 


k 


„4k+4 


k 


I (-arjh  +a—  I W1'3^ 
i,J=l  J 1,0=1 


l1.^-2.)-  ^2I  £ (-a)1+^  h.  . - 232k+2  J (-a)^-1  h.  + a 


4k+4 


QC 


2 Xk 


i*3=  1 


hi3J 


j=l 


13 


because  h. . = h . . . 

13  3i 

The sum inside the square brackets will now be written in terms of the $f_{ijs}$ introduced in (8.43), and hence will contain the four terms that will be calculated in the sequel. The first such term is


flll  I I aimo-mj'  [(xt)1+^23!2k+2(^)J“1-^4k+4(-a)":L^} 

m=l  i5j=l 


(8.45)  = fm  I \|  I (-a)V 


1 

i im 


m=l  1 i=.l 


-2CC 


2k+2 


+ a 


4k+4 


k 

I 

i=l 

k 


, im 

(-0; ) or 


i w 


vm  1 


L 3=1 


I (-a)- 


Vm 


3=1 


By direct evaluation we find that for $m = 1, 2, \ldots, k$,


(8.46) 


k 


2 (_a)i„im  _ m ( -a )m (l^2k+2 )+  (k+1  gzz L LsTz L ± ) Z1 


„2k+2  r , 


-mn 


cr 


i=l 


(l-a2) (i-a2k+2) 


(8.47)  t (-a)' 

3=1 


m(-a)~m(i^2k:2)+(k+l)[  (-a)m~(-a)~m] 
(l-a2)  (l^2k+2) 
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o.r^iiv 


(8.148) 


I (-a)V 


i im 


i=l 


m2a2m(l-a2k+2)  +2(k+lft2k+2(lM2k+2)(na2m-m)  + (k+lfahk+h(ct2m-KX~2m-2: 


(l  -a2 ) 2 (l  -a2k+2 ) 2 


(8.49) 


I (-potV* 

U=i 

= m2a~2m(l-a2k+2)  +2(k+l)(l-a2K+2)  (m-no:~2m)  + (k+i)2(a2m4Q:~2m-g) 

(l^2)2(i^2k+2)2 


(8.50) 


k . .1 

I (-a)Vm 

,i=l  J 


I (^)-Vm 

U=i 


_ m2  (1  hg ) -m  (k+1 ) (l  -a2k+g ) + (k+1 ) (l  -a2k+2  )np:2m-  (k+1 ) (l  -a2k+g  )a2k+2n pT2m 


(l-a2)  (i-a2k+2) 


+ (k+1 ) 2a2k+2  (a2m4o:  gm-2 ) 

(l-a2)  (i-a2k+2) 

Hence the factor of $f_{111}$ in (8.45) is $[(1-\alpha^2)^2(1-\alpha^{2k+2})^2]^{-1}$ times


| ( m2a2m(l^2k+2  42— 2mr-A- , ,4~2k+2 ,,  „2k+2  N , _2k+2  N ,,  ^2k+2  . 
m=l  4 


[ 2 ( k+l)a (1  ) +aa^  (k+l ) (l -a*  ^ ) ] 


+ a2”[  (M)^k4*(w)t»ktl't2.*k*1>(k,i)!]^a-2"(iJIa«>)«‘lW 


-2mr  _4k+4 


2k+2 1 _4k+4, 


+ ncT~“[  -2cT—  (k+1)  (l-a^)-zx^  (k+l)  (l^2k+2)] 


(8.51) 


+ a-2m[  (k+l)2a4k+\(k+l)Vk+4+2a4k+4(k+l)2] 


2 ?k+2  ,n  2k+2 


+m[-2(k+l)ct2k+2  (i-a2k+2)+2a1|k+lt  (k+l)  (l  -adK+d) 


„2k+2 , 
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- 222k+2(k+l)(l«a2k+2)  ]-2(k+l)2a4k+4-2(k+l)2a4k+1+-lja4k+4(k+l)2J  . 
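Next we note that

$$
\begin{aligned}
\sum_{m=1}^{k} m^2\alpha^{-2m}
&= \alpha^{-2k} \sum_{m=1}^{k} (m-k+k)^2 \alpha^{2(k-m)} \\
&= \alpha^{-2k} \left[ \sum_{m=1}^{k} (k-m)^2\alpha^{2(k-m)} - 2k \sum_{m=1}^{k} (k-m)\alpha^{2(k-m)} + k^2 \sum_{m=1}^{k} \alpha^{2(k-m)} \right] \\
&= \alpha^{-2k} \left[ \sum_{m=1}^{k} m^2\alpha^{2m} + k^2 \sum_{m=1}^{k} \alpha^{2m} - 2k \sum_{m=1}^{k} m\alpha^{2m} - k^2\alpha^{2k} - k^2\alpha^{2k} + k^2 + 2k^2\alpha^{2k} \right] \\
&= \alpha^{-2k} \left( \sum_{m=1}^{k} m^2\alpha^{2m} - 2k \sum_{m=1}^{k} m\alpha^{2m} + k^2 \sum_{m=1}^{k} \alpha^{2m} + k^2 \right) ,
\end{aligned}
\tag{8.52}
$$

$$
\sum_{m=1}^{k} m\alpha^{-2m} = \alpha^{-2k} \left( - \sum_{m=1}^{k} m\alpha^{2m} + k \sum_{m=1}^{k} \alpha^{2m} + k \right) ,
\tag{8.53}
$$

$$
\sum_{m=1}^{k} \alpha^{-2m} = \alpha^{-2k} \left[ \sum_{m=1}^{k} \alpha^{2m} + (1-\alpha^{2k}) \right] .
\tag{8.54}
$$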

Substitution  in  (8.51)  leads  to 

X jm2a2m(i-a2k+2)  +m.cPmk  (k+l  )a2k+2  (i-a.^2 ) 4a2m4  (k+l )2a4k+4 
m=a  L 

+ a2k+4  (i=a2k+2 ) [m2a2m-2kna2ni+k2a2ni]  -4a2k+4  (k+l ) (i-a2k+2 ) [ -rta^+MX2®} 

+ 4 (k+l)2a2k+4a2m+m22ci:2k+2(i-a2k+2)  -m4  (k+l)a2k+2(l-a2k+2)  j 

- 8k (k+l  )2a4k+4+k2o:dk+4  (i-a2k+2 ) -k4  (k+l)p:2k+4  (i_a2k+2  )+4a2k+4  (k+l )2 

(l-a2k) 
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+ 1*  (kfl)  2a2k+lt]+m2aJ2k+2  (i^2k+2  )2-mk(k+i  )a2k+2  (i  ^2k+2  )2} 

(8-55) 

= -8k(k+l)2o;4k+4-to!2k+i|[k2(i43:2k+2)2-i+k(k+l)  (i^2k+2)+4  (k+l)2(i-a2k)  ] 


+ ao:2k+2(i^2k+2)2  | k(k+i)(2k+i)-4(k+l)a2k+2(i^2k+2)2  |k(k+l) 

+ (l-o:2k+2 )2 (i-ta2k+4 ) a 2 (l-^2)(l-a2k)-ko^k[k(l-a2)+2]  (l-a2) 

(i-a2)3 

■+  2a2k+2(i^2k+2)[2(k+l)(l^2)_kc?(i^2k+2)]  a2  a^2k>k(l^2)a2k 

(l^2) 

+ a2k+4  |4(k+l)2(lk3'2k)+k(i^2k+2)[k(l-a2k+2)-4(k+l)]|  a2  , 

^ l«oi 

where we have summed $\sum m^2\alpha^{2m}$, $\sum m\alpha^{2m}$ and $\sum \alpha^{2m}$. This expression can be rearranged to read
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a (l4a  )(3--a  ) (i_a2k+2)  (na2k+i+ ) + a2k+2  I"  (i-oi' 

o 3 L 


2k+2 ' 


(l-a2) 


k2a2-i  (4k+5)k(k+l) 


. te(k+l)a2(l^2k+2) 


(8.56) 


(l-a2) 


2 4 

, Pk+2^  ^,2ks  ka 

+ (l-a  ) (l-a  ) — — 

l-a 


2k+2)fi^2k)  '342L-  ko? 


+ (i-a  ) (l-a  ) 


l-a 


l-a  li-a 


+ (l^2k)  toVl)2  [1  ♦ a4k+4  [(l^2k+2)2 

_ ‘.(l^2ktg)(^l)k(14a2|  _ 8fc(k+1)g\  ( 

l-a2  J 


and hence $[(1-\alpha^2)(1-\alpha^{2k+2})]^{-2}$ times (8.56) is the coefficient of $f_{111}$ in (8.45).

The remaining terms inside the square brackets of (8.44) are


f y V [o-imcrm+1^’+cr1?m+1crm^][  (=a)i+<^-a2k+2  (-a)1"'5-a2k+2(-a);j"1 

1 P"j  /L-,  dLr* 


121 


m=l  i,j=l 


+ ct^C-o)-1'3] 


= 2f  t 
121  •L 


m=l 


k 

X 

Li=l 


(~a)Vm 


X (“«) 

•J=l 


r 

Li=l 


=aa2k+2  | («a)_1crlm 


k 


K-a)^ 
J=l 


d m+1,3 


i 

a 


(8.57) 


+ ci4k+4['  y (■a)'Vmjj'  I (-ar3o-m+1'JI 


L 


i=l 


-0=1 


where  we  used  that  = cr^1  , 
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ll,fc-l,2 


k rr  k . . ir  k 

I I(^)Vm 

m=l  lLi=l  J 1,1=1 


]T  (_cc)Jo-k“m^  0 


» 2 a 


2k+2 


r k 

I 

i=l 


(-a  )-Vm 


(8.58) 


k 


I (“«) 

■0=1 


+ a 


4k+4 


k 

I 

Li=l 


I (-arVm 


■ k 

I 

•0=  1 


cr 


lk2 


k f 

I 

1 (~a)V-m' 

| (^)5a-k+1"m^‘ 

-2Q?k+2 

| ( -a ) “icrim 

m=l  [ 

.i=l. 

-0=1 

-i=l 

(8.59) 


k 

£ i-uj”cr 
0=1 


(-a)^k+1-m^' 


+ a 


4k+4 


• k 

I 

•i=l 


I (-a)"Vm 


k 

I 

‘•0=1 


£ (^)^crk+1“m^ 


These expressions are evaluated as was (8.45). Finally we obtain for (8.44) the expression


(1-a  ) .2 

2 *°k 

a 


(l-a2)  (i-a2k+2) 


{(l«2)  + c<2k+2  .(l..-“2) [3^g^tg (no*2 ) ] | f 


(I^2k+2) 


(l-a2) 


+ «2k+\1^4k+\a} 


(8.60) 


22 


1 
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■{'2(-a) 


k+2  W ]/•,«*■  2. 


l-a' 


ife  j | )3fct%2+ (^-)51c+4 


+ ^2a(-a 


)k  (l-a2)  (i-a2k+i|) 
d^2k+2)2 


} {(-“)kt3Al(1+  )3k+3A1(2+  (^)5k+\3}  , 


where  A^  and  A^  are  easily  recognized  in  (8.56)  and 


a = (i^2k+2)2(k(k*i)  . jl, 

21  L 6a2 


(8.61) 


- (1^2k+2)  kg(l^)^k  . (lx(2k)  _0? 

(l^2)  ^ 


l+2k 

l-a2 


(i„a2k+2)  |(i-a2k)  (k+l) 


a2  (i-2k)  + 8a2 

1‘a2  (ixc2)2 


k(k+l) 


-7k  (k+1) 


+ (l-a 


2k  ^ 4 (k+1)' 
l-a2 


a..  = -te(k+i)2a«2)+d^2k+2)2 
22  l-a2 


(8.62) 


+ (i^2k+2)k(k+i)  + (1J22k)  , 

l-a2  l-a2 
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A,n  = (l^2k+2W  . (l«a+lt) 

')±  L Zx 


k (k+1 ) (2k+l) 
6a2 


+ (l^2k) 


2 <k  " 

2\  l-a  J 


(8.63) 


(l-a  ) 


(l^i+Mjk(k+i)+(i.«2k)  ieiJs!  (3k  . -4)] 

(-  l -a  l -a  J 


(lXC2k+2)  (l=a2k)  SMiEiiJ,  } 

l-a2 


A_2  = 4k(k+i)2a2+(i-a2k+2)  I i-  k2(k+l)a2  - 

>-  l-a2  ,,  ~2x2 


(l-ac) 


(l-a2k)4(k+l)2a2  i!2L 
i-a£ 


(8.64; 


+ a-a2^2)  ((i«2k+2)  + (1JJ2k)  fefeaJg 

l l-a2  l-a2 


k (k+1)  [ 2k (l-a2 ) - (l-fa2 ) ] j 


(8.65) 


= 4k (k+i)2  , 


„2k+2 - 


A4l  = (l-a^)  1 2k  + - (na“TC  ) 


k(k+l)‘ 


2a 


2k+2  ^ k(k+l)(2k+l) 
6a2 


(8.66) 


2k  x 2ac 

+ (l-a  ) —t 
l=a£ 


k + 


k-l  ljor 
2 2 
^ (l-a2)  2 
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+ (l^AAkfk-aMl^A  (3k  - -§^) 

L i -or  a (i-£ i ) 


2k+2w,  ^kx  3 (k+l)' 


(l-a^Kl-o^) 


l-a 


A, 


'42 


- 4k(k+l)2+a-a2k+2)2f|  k(k+l)2  - 

1 l-a  /,  ™2 \ 


(l-a2) 


+ (1  -a2k+2 ) (na2k+2 ) 


(8.67) 


2k+2  ^ k(k+l)  (30^+1)  + (lJa2k)  (k+l 

l-a2  l-a2 


. (1^k,  8(kxl)V  ( 

1 JDC 


(8.68) 


a43  = Uk(k+1)  . 


By  operating  with  these  components  one  obtains  the  form 


(8.69)  v = (l-a2)  \2(l-a2k)  + 


Tc 


(l^2) 


(l-a2k+2) 


2 2 
a 


■{ 


a2k+2B  na^k+1+B  -ta^k+^B, 
12  3 


} 


where 


(8.70)  B1  = (i-a2k+2)/k(k+l)(lija2-ta2+k)+(l-a2k)  — ^L__  [k(i-a2) (i-2a2-2k)+3 

^ (l-a2) 

- ioa2+5a4+(i-a4)]-(i-ta2k+4)  j k(k+l)  (2k+l)  (l-a2)} 
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+ 


(i-a2k) 


~2(k+l)a 2 Mk+1 ) + (l  ^2 ) [3k~2  (k+1 ) f l^a2 ) 1 

l -oP 


v ' 2 
l-a 


(l+a2k+2 ) g(k+l)a4[3k(l^2)-4]  + ^^k+4^4  2(l-k)+ka2(3-a2) 
^ (l-a2)" 


y2k+2Ulary2k+4,  a2  (l-*a2 ) l 

l-a2  J 


+ (l-a211  2)(nadk  4) 


+ (i^2k+4)  ^-k(i^2)[ia2+(k+i)2]  + (na2k+2)  l k(k+l)(2k+i)(i-a2)j 
+ (i-*a2k+2)  6a2(i-a2)k(k+l) 


2 ( 

+ (l^2k+2)  | j (l-^2)[3k2a2.(4k+5)k(k+l)]+|-k2(k+l)[3(k+l)+5a2(k-l)] 


, 2k2a2 
l-a2 

+ (l-a2k)  -2lL_  [ k2  (l^a2 ) - (2k2 -l ) ] - (l ~a2k+!*  )k  2±*fl^2 


l-a 


l-a 


>) 


l-a 

l-a 


2^2  |(1  -«2k  )6  (k+1  )2a2  - (na2k+2  )6k  (k+1 ) (l -a2  )a2  - (i  _a2k ) (i^2k+2 ) 

g«2  (k+l ) 3KLji^g)-y2-i  | 

i _a2  J 
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(8.71) 


i2  = 2k (k+1 ) (l-a2)[  (l-a2)(2k-l)-ta2] 


+ (i-a2k+2)f [ (l-a2)8ka2-2(k+l )(l-a2)[k (I-90? )+2k (1-fa2)2 

l (l-a2)  ‘ 

+ 2kO?(i-a2)  (l-a2)  wt2  |2k+  (k+1 ) ™^|-(l-a2) 


k(M)]-(x«2k+1,)2k  Si %ik^L 

(l^2) 

+ (X-a2k)  ^—r  2X2 [ 1 -k (X xt2 ) ] 4- (x«x2k  2 g4-k.&i^.2| 

1J2  (x-a2) 

+ (X-a2k){  2 (k+1 [-4(ic+3)c?+(5+3a1*)]  + (x+<^k+2)kCi1,[k(x-ci2)-2]) 
l 1-a2  j 

+ (l-ta2k+2)'|  8k(k+l)a2+(i-a2)  jk2a2  - I^kt£j£fe-tilj 


- (na2k+4)k[k(l-a2)+2]  ■ 


2^  (k(l-«2)+2)l 


+ (l-a2k+4)2k(l-a2)I  ^ 

*■  l-a  (l-a  ) 

♦ (1^2k+2)2  2k£? 


l-a 


.2k 


+ 8k  (k+1)2  (1-a2  )a2  + |»4a2(k+l)23  (l+o^) 


l-a 


l-a 


+ (na2k  )8  (k+1  )2a^ 


+ (i-ta2k+2  )4  (k+l  )a2  [ (1+a2 ) -1a2  (1-a2 ) ] - (i-a2k+i|  )2  (k+1  )2a2  j 
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l-a 


l-a 


l-a 

2k+4 


(l-a2S+2) 


f -8k (k+1  )2  (l-a2 )+  (i-a2k )i 6 (k+1  )2a2 j“ 


■ ^-40t  p — f 4 (k+l )2  (l -a2 ) a2+  (na2k )4  (k+1  )2  (l-a2  ja^' } 

OV4-0  . I J 


(l-a2k+2) 


(8.72)  = -10k  (k+1)  (l-a2)-(i-a2k+2)2ka?+(na2k+2)2k2(l-a2)a2 

+ { "8k  (k+1)2  (l-a2)  (24a2)-  (i-a2k)8(k+l)2a^ 

- (I4a2k+2)4k(k+l)(l-a4)| 

+ 1 — - f - (i4a2k+2  )8k  (k+1  )2  (l-a2 ) -(l-a2k+i+)8k(k+l)2(l-a2) 

(l-a2k+2)  1 


This completes the proof that (4.31) is given by (4.21). Q.E.D.


8.5 Proof of Corollary 4.7 (Section 4.3).
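From (8.42) we have that

$$
\begin{aligned}
\lambda_k &= (1-\alpha^{2k+2}) \, \frac{ 2\alpha^{2k+2}(1+\alpha^2) - (1+\alpha^{2k+2})(1+\alpha^{2k+4}) }{ \left[ (1-\alpha^{2k+2})(1+\alpha^{2k+4}) - 2(k+1)\alpha^{2k+2}(1-\alpha^2) \right]^2 } \\
&= (1-\alpha^{2k+2}) \left[ -1 + \alpha^{2k+2}(1+\alpha^2) + o(\alpha^{2k}) \right] \left[ 1 - \alpha^{2k+2}(1-\alpha^2)(2k+3) + o(\alpha^{2k}) \right]^{-2} \\
&= \left[ -1 + \alpha^{2k+2}(2+\alpha^2) + o(\alpha^{2k}) \right] \left[ 1 - \alpha^{2k+2}(1-\alpha^2)(2k+3) + o(\alpha^{2k}) \right]^{-2} \\
&= \left[ -1 + \alpha^{2k+2}(2+\alpha^2) + o(\alpha^{2k}) \right] \left[ 1 + 2\alpha^{2k+2}(1-\alpha^2)(2k+3) + o(\alpha^{2k}) \right] \\
&= -1 + \alpha^{2k+2} \left[ (2+\alpha^2) - 2(1-\alpha^2)(2k+3) \right] + o(\alpha^{2k}) ,
\end{aligned}
\tag{8.73}
$$

so that

$$
\lambda_k^2 = 1 + \alpha^{2k+2} \left[ 4(1-\alpha^2)(2k+3) - 2(2+\alpha^2) \right] + o(\alpha^{2k}) .
\tag{8.74}
$$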
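
The order of the remainder in the expansion of $\lambda_k^2$ can be examined numerically. In the sketch below `lambda_k` is the exact expression and `lambda_sq_approx` the two leading terms; both names are ours, and the comparison divides the error by $\alpha^{2k}$ to show that it is of smaller order.

```python
from fractions import Fraction

def lambda_k(k, a):
    """Exact lambda_k as given in the first line of (8.73)."""
    c = a ** (2 * k + 2)
    denom = (1 - c) * (1 + c * a * a) - 2 * (k + 1) * c * (1 - a * a)
    return (1 - c) * (2 * c * (1 + a * a) - (1 + c) * (1 + c * a * a)) / denom ** 2

def lambda_sq_approx(k, a):
    """Leading terms of lambda_k squared, as in (8.74)."""
    c = a ** (2 * k + 2)
    return 1 + c * (4 * (1 - a * a) * (2 * k + 3) - 2 * (2 + a * a))

def relative_remainder(k, a):
    """|lambda_k^2 - approximation| divided by a^(2k)."""
    return abs(lambda_k(k, a) ** 2 - lambda_sq_approx(k, a)) / abs(a) ** (2 * k)
```

For $\alpha = 1/2$ the relative remainder is already negligible at moderate $k$, since it behaves like $k^2\alpha^{2k}$.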

Substituting in (4.21) or (8.69) we have that

(8.75)  $v = (1-\alpha^2)(1-\alpha^{2k}) \left\{ 1 + \alpha^{2k+2} \left[ 4(1-\alpha^2)(2k+3) - 2(2+\alpha^2) \right] + o(\alpha^{2k}) \right\}$

+ ^~l-a2k+2  |l-K5:2k+2[4  (l-a2)  (2k+3)-2(2-ia2)]+0(a2k)) 

a2  1 4 J 

[ 1 _2c£k+2+o  (a2k ) ] +0  (a2k ) 

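$= (1-\alpha^2) \left\{ 1 - \alpha^{2k} \left[ 1 - 4\alpha^2(1-\alpha^2)(2k+3) + 2\alpha^2(2+\alpha^2) \right] + o(\alpha^{2k}) \right\} + (1-\alpha^2)\,\alpha^{2k} B^{(0)} + o(\alpha^{2k})$

$= (1-\alpha^2) \left\{ 1 - \alpha^{2k} \left[ 1 - 8\alpha^2 + 14\alpha^4 - 8k\alpha^2(1-\alpha^2) \right] \right\} + (1-\alpha^2)\,\alpha^{2k} B^{(0)} + o(\alpha^{2k})$ ,

which is (4.32). Q.E.D.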
9. MATHEMATICAL DETAILS CORRESPONDING TO CHAPTER 5.


9.1 Proof of Theorem 5.3 (Section 5.2).

Letting $\sigma^2 = 1$ without loss of generality, and using that $\beta_j^* \sim (-\alpha)^j$, we find that for $i, j \ge 2$


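$$
a_{ij} = \lim_{T\to\infty} \mathcal{E}\, u_i u_j = a_{ij1} + a_{ij2} + a_{ij3} + a_{ij4} ,
\tag{9.1}
$$

where

$$
\begin{aligned}
a_{ij1} &= \left[ (-\alpha)^{i-1} + (-\alpha)^{i+1} \right] \left[ (-\alpha)^{j-1} + (-\alpha)^{j+1} \right] (1 + 5\alpha^2 + \alpha^4) \\
&\quad + \left\{ (-\alpha)^j \left[ (-\alpha)^{i-1} + (-\alpha)^{i+1} \right] + (-\alpha)^i \left[ (-\alpha)^{j-1} + (-\alpha)^{j+1} \right] \right\} 4\alpha(1+\alpha^2) \\
&\quad + (-\alpha)^{i+j} \, 2(1 + 4\alpha^2 + \alpha^4) \\
&= (-\alpha)^{i+j-2} \left\{ (1+\alpha^2)^2(1 + 5\alpha^2 + \alpha^4) - 8\alpha^2(1+\alpha^2)^2 + 2\alpha^2(1 + 4\alpha^2 + \alpha^4) \right\} \\
&= (-\alpha)^{i+j-2} \left[ \alpha^2(1+\alpha^2)^2 + (1+\alpha^4)^2 \right] , \qquad (i, j \ge 1) ,
\end{aligned}
\tag{9.2}
$$

$$
\begin{aligned}
a_{ij2} &= (-\alpha)^i \, 2\alpha^2 + (-\alpha)^{i-1}(1+\alpha^2)\, 2\alpha(1+\alpha^2) = -2(-\alpha)^i (1 + \alpha^2 + \alpha^4) , && j = 2, \ (i \ge 1) , \\
&= (-\alpha)^{i-1}(1+\alpha^2)\,\alpha^2 = (-\alpha)^{i+1}(1+\alpha^2) , && j = 3, \ (i \ge 1) ,
\end{aligned}
\tag{9.3}
$$

$$
\begin{aligned}
a_{ij3} &= -2(-\alpha)^j (1 + \alpha^2 + \alpha^4) , && i = 2, \ (j \ge 1) , \\
&= (-\alpha)^{j+1}(1+\alpha^2) , && i = 3, \ (j \ge 1) ,
\end{aligned}
\tag{9.4}
$$

$$
\begin{aligned}
a_{ij4} &= 1 + 4\alpha^2 + \alpha^4 , && i = j = 2, 3, \ldots , \\
&= 2\alpha(1+\alpha^2) , && |i-j| = 1 , \\
&= \alpha^2 , && |i-j| = 2 , \\
&= 0 , && \text{otherwise.}
\end{aligned}
\tag{9.5}
$$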
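
Two pieces of the algebra above can be checked exactly: the reduction of $a_{ij1}$ to its final form, and the observation (ours, not stated in the text) that the three nonzero values of $a_{ij4}$ coincide with the autocovariances generated by the weights $(1, 2\alpha, \alpha^2)$. The function names in this sketch are our own.

```python
from fractions import Fraction

def a_ij1_expanded(i, j, a):
    """First expression for a_ij1, before collecting powers of -alpha."""
    p = lambda n: (-a) ** n
    return ((p(i - 1) + p(i + 1)) * (p(j - 1) + p(j + 1)) * (1 + 5 * a ** 2 + a ** 4)
            + (p(j) * (p(i - 1) + p(i + 1)) + p(i) * (p(j - 1) + p(j + 1)))
            * 4 * a * (1 + a ** 2)
            + p(i + j) * 2 * (1 + 4 * a ** 2 + a ** 4))

def a_ij1_closed(i, j, a):
    """Final expression: (-a)^(i+j-2) [a^2 (1+a^2)^2 + (1+a^4)^2]."""
    return (-a) ** (i + j - 2) * (a ** 2 * (1 + a ** 2) ** 2 + (1 + a ** 4) ** 2)

def autocov_of_weights(a):
    """Autocovariances at lags 0, 1, 2 of a filter with weights (1, 2a, a^2)."""
    w = [1, 2 * a, a * a]
    return [sum(w[m] * w[m + lag] for m in range(3 - lag)) for lag in range(3)]
```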


To evaluate $a_{11}$ and $a_{1j}$ for $j \ge 2$ we use (5.14). Combining these results we find that the $a_{ij}$ are given by (5.21), with the terms defined above holding also for the case of $i$ or $j$ equal to 1. That is why we included the value of 1 in the ranges of (9.2), (9.3) and (9.4) above.

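We further approximate as in (4.41)

$$
\frac{\partial \alpha^*}{\partial \beta_j^*} \approx -(1-\alpha^2)^2 (-\alpha)^{j-1} = \frac{(1-\alpha^2)^2}{\alpha} \, (-\alpha)^j , \qquad j = 1, 2, \ldots, k ,
\tag{9.6}
$$

$$
\sigma^{ij} \approx (-\alpha)^{j-i} \, \frac{1-\alpha^{2i}}{1-\alpha^2} , \qquad j \ge i .
\tag{9.7}
$$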


r-  Sa*  Scf  y is  to 
) • — — ) cr  a J.CT 

1,3=1  aPf  SP*  s,t=l  st 

X J 


(l-a2  )4  f r^2  _2  \2  ,,1^2,  £ | 


{[a2  (i-ta2  )2+  (itf  )2]  £ ^ (-a)i+J(^)s+t-Vscrtj 

1,3=1  s,t=i 


136 


X (-a)1+t3[  (-2)(i+c^+a4)  y (-a)Vscr2J 

i ji  j =1  s=l 


k 


+ (-o:)(l+o!2)  X (^)S<x13cr^ 
s=l 


(9.8)  + (-2)  (l-ta2^)  X (-a)tcr:L2a-t;]' 

t=l 


+ (-a^i+a!2)  X 
t=i 


k-l 


+ (l+lja2+cf)  X (X^cr^^aCl+a2)  X o-13cr3+1^ 
s=2  s=2 


+ 2dr  X cr13aS+2^ 
s=2 


>} 


a 


f a2  (na2  )2+  (na1*  )2 

k 

I C^)1Di 

^ a2 

i=l  1 

i+ (l-ta2^  )d  X (-a)1^.  -2a(i[fQ2)D  X C-01)^. 

^ _n  1 3 • _n  1 


i=l 


k 

1 

i=l 


k-l 


k-2 


+ (1  +4a£+cT)  X D D +4a(i+a2)  T dd  n+2C^  J d D , , 
s=2  3 3 s=2  3 S+1  s=2  3 s+2  ^ 


4 


where as shown in (4.42)

$$
D_s = \sum_{i=1}^{k} (-\alpha)^i \sigma^{is} \approx \frac{s\,(-\alpha)^s}{1-\alpha^2} , \qquad
\sum_{i=1}^{k} (-\alpha)^i D_i \approx \frac{\alpha^2}{(1-\alpha^2)^3} .
\tag{9.9}
$$

Then 


a 


2'4  | ag(i«P)2+(i+o.y  «4g  6 . 4 (l-H^-Kt1 ) 


(i-<r)  J a 


a" 


(l^)v 


2.S 


+ 2a 


(1+a2) 


3a3  ~2 


a 


(l+4o£+a4)  I g2a2s 


l-a2  (l-c?)^  (i^2)2  s=2 


4a(i+^ ) k-1 

(l-o? )2  s=2 


P I s (s+l)  (-a)2s+1  + 

2 ~ (l-cr)^  s=2 


2Q*  ? *f  s(s+2)a2s+2| 

^ n — O J 


(l^x2)^  (^2  a2  (i+o?)2+(i+qi4)2  Qa4  i+o£+a4  + 6a6(l+o£) 

aW  a-o?)r 


2TF 


(l-a  ) 


+ l s2a2s 
s=2 


l+4a24a4  , 4a(na2)  , , 2a2  „2 

p"o"  + — - — n-p  i-a ) + p“p  a 

. (l-a2)2  (l-a2)2  (l-a2)2 


+ £ sa 

s=2 


2s 


(^)  + „ g,2 


a ^2)2 


a^2)2 


(l-a2)11  r ~2 


a L(i-cr) 


- — (i_7a2+i8a4-5a6+3a8 -2a:10) 

L(l^2r 
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(9.10) 


+ ~^~2  I ^ 

(i -cry  s=2 


4oF 


(, l-a 2)  s=2 


00  _ 

E s0^£ 


(l-a2)11  a2 


a2  (l-a2)4 


l-7o£+l8a4-5a^+3a8-  2a1Q  (l-a11)  (4Q^-3a4 


(l-a2)2 


l-a 


\dr{2.S  -a4) 


1 

(l-a2)2 


l-7«2+l8a4  -5a6+3a8-2aLO+  (i_a2 ) (ixe4 ) (4c£_3a44a6 ) 


_ 4a2 (l-a2)2 (2c£ -a4) 


= - ■ 12  g [l-3Q^+3a4+i5a6-7oP_2aLO+aL2]  , 


which  is  equivalent  to  (5.24).  Q.E.D. 
APPENDIX A


A. The Finite Autoregressive Representation for q > 1 (Section 1.2).

In Section 1.2 we derived the exact representation (1.17) when $q = 1$. We want to extend that result here.

For general $q$ we proceed along the same lines. From (1.1) by successive substitution, we have


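$$
\begin{aligned}
\epsilon_t &= y_t + (-\alpha_1)\epsilon_{t-1} + (-\alpha_2)\epsilon_{t-2} + \cdots + (-\alpha_q)\epsilon_{t-q} \\
&= y_t + (-\alpha_1)\left[ y_{t-1} + (-\alpha_1)\epsilon_{t-2} + \cdots + (-\alpha_q)\epsilon_{t-1-q} \right] + (-\alpha_2)\epsilon_{t-2} + \cdots + (-\alpha_q)\epsilon_{t-q} \\
&= y_t - \alpha_1 y_{t-1} + \left[ (-\alpha_1)^2 + (-\alpha_2) \right]\epsilon_{t-2} + \cdots + \left[ (-\alpha_1)(-\alpha_{q-1}) + (-\alpha_q) \right]\epsilon_{t-q} + (\alpha_1\alpha_q)\,\epsilon_{t-(q+1)} .
\end{aligned}
\tag{A.1}
$$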


It is then clear that at stage $k$ ($k = 0, 1, \ldots$) we have an expression of the form

$$
\epsilon_t = y_t + \gamma_1 y_{t-1} + \cdots + \gamma_k y_{t-k} + \delta_{1k}\,\epsilon_{t-k-1} + \cdots + \delta_{qk}\,\epsilon_{t-k-q} ;
\tag{A.2}
$$

substituting from (1.1)

$$
\epsilon_{t-k-1} = y_{t-k-1} + (-\alpha_1)\epsilon_{t-k-2} + \cdots + (-\alpha_q)\epsilon_{t-k-1-q} ,
\tag{A.3}
$$

we see that


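$$
\begin{aligned}
\delta_{j,k+1} &= \delta_{1k}(-\alpha_j) + \delta_{j+1,k} , \qquad j = 1, 2, \ldots, q-1 , \\
\delta_{q,k+1} &= \delta_{1k}(-\alpha_q) , \\
\gamma_{k+1} &= \delta_{1k} .
\end{aligned}
\tag{A.4}
$$

These recursive relations are the same as the ones obtained by analysing in like manner the autoregressive model; see Anderson [(1971a), p. 168].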
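
The recursion can be run directly. The sketch below starts from stage 0, where $\epsilon_t = y_t + \sum_j (-\alpha_j)\epsilon_{t-j}$, so that $\delta_{j0} = -\alpha_j$, and applies the three relations $k$ times; the function name and the return convention are ours.

```python
from fractions import Fraction

def ar_representation(alphas, k):
    """Apply the recursion k times, starting from delta_{j0} = -alpha_j.

    Returns (gammas, deltas) such that
    eps_t = sum_{j=0..k} gammas[j] * y_{t-j}
            + sum_{j=1..q} deltas[j-1] * eps_{t-k-j},
    with gammas[0] = 1.
    """
    q = len(alphas)
    gammas = [Fraction(1)]
    deltas = [-Fraction(a) for a in alphas]
    for _ in range(k):
        d1 = deltas[0]
        gammas.append(d1)  # gamma_{k+1} = delta_{1k}
        deltas = [d1 * (-Fraction(alphas[j])) + (deltas[j + 1] if j + 1 < q else 0)
                  for j in range(q)]
    return gammas, deltas
```

For $q = 1$ this gives $\gamma_j = (-\alpha)^j$ and a single remainder coefficient $(-\alpha)^{k+1}$, the case treated in Section 1.2.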

Hence the alternative representation of (1.1) is


j=0  ~ 8J^k+iet-k-j  } 


where  the  coefficients  satisfy  (A. 4).  Denoting  as  before 


£t  " ^ 8j,k+let-k-j  ’ 


we  verify  easily  that 


(A.  7) 


^et9k  = 0 


for all relevant $t$ and $k$. We compute the variances and covariances as follows:
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* 


6 


.* 


t,k  t+s,k 


= r 


Sj,k+lGt-k-g 


"t+s 


q. 

£ s, 


.i=i 


j,k+l  t+s-k-j 


(A.  8) 


= !?<= 


tGt+s 


.E  Bj,,k+1  ^^Gtet+s-k-„j  + et-k-g€t+s^ 
0 ~ 


& q 

+ n.,5i  BoA+i53'A+l  ^q-k-jS+s-k-j' 

The independence of the $\epsilon_t$'s implies that


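$$
\mathcal{E}\,\epsilon_t\,\epsilon_{t+s} = \sigma^2 , \qquad s = 0 ,
\tag{A.9}
$$

$$
\mathcal{E}\,\epsilon_t\,\epsilon_{t+s-k-j} = \sigma^2 , \qquad s = k+j ,
\tag{A.10}
$$

$$
\mathcal{E}\,\epsilon_{t-k-j}\,\epsilon_{t+s} = \sigma^2 , \qquad s = -k-j ,
\tag{A.11}
$$

$$
\mathcal{E}\,\epsilon_{t-k-j}\,\epsilon_{t+s-k-j'} = \sigma^2 , \qquad s = j'-j ,
\tag{A.12}
$$

and equal to 0 in the other cases, respectively.
When $s = 0$ we are left with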


(A.  13) 


Var(6t,k>  = ^ 


1 + 


2 

hk+1 


y 


llf2 


and, as in the case of $q = 1$, $\operatorname{Var}(\epsilon_{t,k}^*) \ge \operatorname{Var}(\epsilon_t)$.

For $s \ne 0$, (A.10) gives rise to a contribution of $-\sigma^2\delta_{s-k,k+1}$, provided that $1 \le s-k \le q$ (i.e., $k+1 \le s \le q+k$); (A.11) gives rise to a contribution of $-\sigma^2\delta_{-s-k,k+1}$, provided that $1 \le -s-k \le q$ (i.e., $k+1 \le -s \le q+k$); finally (A.12) gives rise to a contribution provided that $1 \le s+j \le q$ (which implies that $j \le q-s$; also $s = j'-j$ implies that $|s| \le q-1$).

For  q > 1 it  then  turns  out  that  the  final  expression  for  (A.  8) 
is: 


q-l  si 


0ov<6£.k>€£+E. k>  = I ^ ,+r8. 


‘ ” A “j,k+lud-|s|,.k+l'  1 = 1 = 1,2,..., q-l 

tJ  “L 


(A. 14) 


a26i 


s -k  j k+1  ' 


s|  = k+1,.. .,  q+k 


= 0 ) otherwise  , 

with  the  convention  that  if  q-l  > k+1,  the  first  two  expressions  must  be 
added  to  give  the  covariance  of  lag  s,  when  s ranges  over  the  set  of 
integers  such  that  q-l  > k+1.  In  general  we  are  interested  in  values  of 
k very  large  compared  with  q. 

With the kind of notation introduced in (1.22) through (1.25) for the ranges in the set $\{1, 2, \ldots, T\}$, we now write
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case  where  t 


(A.  15) 


Its covariance matrix is of order $[T+1-(k+q)]$ with (A.14) as components. The diagonal components of this matrix are nonzero and the components within $q-1$ of the main diagonal are nonzero; the other nonzero components are from $k+1$ to $k+q$ positions above and below the main diagonal. If $k$ is increased the gaps between the three sets of nonzero components are increased.

For the sake of completeness we write (A.14) in matrix form, using the $G_s$ matrices of order $[T+1-(k+q)]$ defined in (1.25):


c e*i  - 

e-k  ~k  “ 


1 + 


<3=1 


*2 

D,k+1 


-T+l-  (k+q) 


(A.  16) 


q-1 


q-s 


+ a ~s  jli  5J,k+l&^s,k+l 


q-k 

^ 8s»k.k+l  ~s  * 
s=k+l  } 


We conclude that the general moving average (1.1) of order $q$ has a representation as an autoregression of order $k$ given by (A.5), where the error term $\epsilon_{t,k}^*$ has zero expectation and the covariance structure (A.16). In the general case, from (A.5) we have that
$$
\sum_{j=0}^{\infty} \gamma_j\, y_{t-j} = \epsilon_t
\tag{A.18}
$$

will be proved if $\sum_{j=1}^{q} \delta_{j,k+1}^2$ converges to zero as $k \to \infty$. This is shown to be true in Anderson [(1971a), pp. 168-70]. Hence we conclude that the moving average (1.1) is equivalent (in mean-square) to the infinite autoregression (A.18).

Notice that $\delta_{j,k+1} \to 0$ implies that the covariances in (A.14) tend to zero and the variance in (A.13) to $\sigma^2$, as $k$ tends to $\infty$, which provides another way of interpreting the transition from the finite representation (A.5) to the infinite one (A.18).